canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Add validation for ipv6 prefix size #7404

Closed asbachb closed 4 years ago

asbachb commented 4 years ago

dnsmasq does not work properly with a subnet smaller than /64 (that is, a prefix length longer than 64). Periodic router advertisements are not sent, which can cause the default route on the guest to expire.

Since many hosting providers assign only a /64 subnet, the user should be prevented from configuring a subnet smaller than /64, or at least warned about it.

See https://discuss.linuxcontainers.org/t/ipv6-route-gets-invalidated-but-not-renewed/7883 for some more details.

asbachb commented 4 years ago

cc: @tomponline

stgraber commented 4 years ago

So only an issue with RA + stateless DHCPv6 but shouldn't be an issue with stateful DHCPv6?

stgraber commented 4 years ago

I'm somewhat wary of us doing actual hard validation on this as that would entirely break any existing user and would prevent using LXD with IPv6 and a shorter subnet when the network does not rely on dnsmasq RAs.

We should log a warning though indicating that dnsmasq and some IPv6 features may not work and document something similar in networks.md
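As a rough illustration of the "warn, don't reject" approach, here is a minimal Python sketch. It is not LXD's code: the function name `ipv6_prefix_warning` and the warning text are made up for this example, and it only uses Python's standard `ipaddress` and `logging` modules.

```python
import ipaddress
import logging

logger = logging.getLogger("ipv6-prefix-check")  # hypothetical logger name


def ipv6_prefix_warning(address):
    """Return a warning string if the prefix is longer than /64, else None.

    `address` is an ipv6.address value in CIDR form, e.g. "2a01:4f8:c0c:eb2a::2/120".
    """
    iface = ipaddress.IPv6Interface(address)
    prefixlen = iface.network.prefixlen
    if prefixlen > 64:
        return (
            f"ipv6.address {address} uses a /{prefixlen} prefix; "
            "dnsmasq router advertisements and SLAAC may not work "
            "with prefixes longer than /64"
        )
    return None


# Warn but carry on, so existing setups that do not rely on dnsmasq RAs keep working.
for addr in ("2a01:4f8:c0c:eb2a::2/120", "fd42:d37c:f0f2:a5f::1/64"):
    message = ipv6_prefix_warning(addr)
    if message:
        logger.warning(message)
```

Returning the message instead of raising keeps the check advisory, which matches the concern about breaking existing users.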

tomponline commented 4 years ago

@stgraber it shouldn't be, but this also needs research as part of this issue, as the OP originally experienced problems with stateful DHCPv6 on /120 and /65 subnets.

The dnsmasq manpage only says:

-F, --dhcp-range=[tag:<tag>[,tag:<tag>],][set:<tag>,]<start-IPv6addr>[,<end-IPv6addr>|constructor:<interface>][,<mode>][,<prefix-len>][,<lease time>]

...
For IPv6, the parameters are slightly different: instead of netmask and broadcast address, there is an optional prefix length which must be equal to or larger then the prefix length on the local interface. If not given, this defaults to 64. Unlike the IPv4 case, the prefix length is not automatically derived from the interface configuration. The minimum size of the prefix length is 64.
...

In the OP's case it was using:

--dhcp-range 2a01:4f8:xxxx:xxxx::2,2a01:4f8:xxxx:xxxx::ff,120,5m

Which seems to go against what the manual says.

It may be worth logging with upstream dnsmasq, as perhaps it should fail to start in that scenario.

tomponline commented 4 years ago

@stgraber we should also explore the relationship (if any is needed) between ipv6.dhcp.expiry and dnsmasq's ra-param interval, in case they can be set so that the lease expires before the next advertisement.

stgraber commented 4 years ago

Or someone should fix dnsmasq :)

That's a somewhat arbitrary limit... there's no reason why RAs shouldn't work on < /64. All that should do is give you a default gateway, forcing IP allocation to happen through DHCPv6.

tomponline commented 4 years ago

@stgraber can we keep this open for now so that we can investigate dnsmasq behaviour more?

tomponline commented 4 years ago

I can't recreate this locally on focal with lxd master or snap.

Confirmed with DHCPv6 stateful enabled, addresses and routes were refreshed when using a /120 subnet.

As expected, without stateful DHCPv6 enabled, a /120 subnet resulted in no SLAAC addresses or routes.
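For background on why SLAAC is not expected on a /120: on Ethernet, stateless autoconfiguration combines a 64-bit prefix with a 64-bit interface identifier, so it only fits a /64. The sketch below (Python, with a helper name `eui64_address` invented for this example) builds the classic modified EUI-64 form. Many systems use other identifier schemes such as privacy or stable-privacy addresses, so treat this as an illustration rather than what any particular guest does.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface ID derived from a MAC."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("a 64-bit interface ID only fits a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get 8 bytes.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net[int.from_bytes(interface_id, "big")]


# MAC of eth0 on the Hetzner host in this thread, applied to the link-local prefix.
print(eui64_address("fe80::/64", "96:00:00:54:ae:27"))
```

With that MAC the result is `fe80::9400:ff:fe54:ae27`, the link-local address shown on eth0 in the `ip a` output in this thread. Passing a /120 prefix raises `ValueError`, since there is no room for the 64-bit identifier.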

tomponline commented 4 years ago

Suspect original poster's issue was caused by having overlapping IPs defined on multiple interfaces.

asbachb commented 4 years ago

@tomponline I can reproduce this on a fresh 20.04 hetzner installation:

root@ubuntu-2gb-nbg1-2:~# apt install snapd
root@ubuntu-2gb-nbg1-2:~# snap install lxd
root@ubuntu-2gb-nbg1-2:~# lxd init

root@ubuntu-2gb-nbg1-2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 96:00:00:54:ae:27 brd ff:ff:ff:ff:ff:ff
    inet 116.203.246.196/32 scope global dynamic eth0
       valid_lft 67817sec preferred_lft 67817sec
    inet6 2a01:4f8:c0c:eb2a::1/64 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::9400:ff:fe54:ae27/64 scope link 
       valid_lft forever preferred_lft forever
3: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fe:90:00:34:63:9e brd ff:ff:ff:ff:ff:ff
    inet 10.45.95.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 2a01:4f8:c0c:eb2a::2/120 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::b491:79ff:fe22:a414/64 scope link 
       valid_lft forever preferred_lft forever

root@ubuntu-2gb-nbg1-2:~# lxc network show lxdbr0
config:
  ipv4.address: 10.45.95.1/24
  ipv4.nat: "true"
  ipv6.address: 2a01:4f8:c0c:eb2a::2/120
  ipv6.dhcp: "true"
  ipv6.dhcp.stateful: "true"
  ipv6.nat: "false"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/topical-catfish
managed: true
status: Created
locations:
- none

root@ubuntu-2gb-nbg1-2:~# lxc launch images:ubuntu/focal c1
root@ubuntu-2gb-nbg1-2:~# lxc shell c1

root@c1:~# ip -6 r
2a01:4f8:c0c:eb2a::/120 dev eth0 proto ra metric 100 expires 2127sec pref medium
2a01:4f8:c0c:eb2a::/120 dev eth0 proto kernel metric 256 expires 2125sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
default via fe80::b491:79ff:fe22:a414 dev eth0 proto ra metric 100 expires 1800sec mtu 1500 pref medium

[wait....]

root@c1:~# ip -6 r
2a01:4f8:c0c:eb2a::/120 dev eth0 proto ra metric 100 expires 1771sec pref medium
2a01:4f8:c0c:eb2a::/120 dev eth0 proto kernel metric 256 expires 1769sec pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium

If you want access to the machine feel free to get in touch - I'll add your public key.

stgraber commented 4 years ago

That does line up with @tomponline's comment that you have the same subnet on two interfaces though which could confuse dnsmasq.

2a01:4f8:c0c:eb2a::1/64 contains 2a01:4f8:c0c:eb2a::2/120
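The containment can be checked with Python's standard `ipaddress` module. This is only a sketch of the check using the two addresses from the `ip a` output above; the variable names are illustrative.

```python
import ipaddress

# Addresses taken from the `ip a` output earlier in the thread.
host = ipaddress.IPv6Interface("2a01:4f8:c0c:eb2a::1/64")     # eth0
bridge = ipaddress.IPv6Interface("2a01:4f8:c0c:eb2a::2/120")  # lxdbr0

# True when the two networks share any address.
overlap = host.network.overlaps(bridge.network)

# True when the bridge network lies entirely inside the host network.
contained = bridge.network.subnet_of(host.network)

print(f"{bridge.network} overlaps {host.network}: {overlap}")
print(f"{bridge.network} is a subnet of {host.network}: {contained}")
```

Both checks come out true for this configuration, since the bridge's /120 sits entirely inside the /64 configured on eth0.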

tomponline commented 4 years ago

I'll test my theory tomorrow.

asbachb commented 4 years ago

> That does line up with @tomponline's comment that you have the same subnet on two interfaces though which could confuse dnsmasq.
>
> 2a01:4f8:c0c:eb2a::1/64 contains 2a01:4f8:c0c:eb2a::2/120

Yeah. You're right.

root@ubuntu-2gb-nbg1-2:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 96:00:00:54:ae:27 brd ff:ff:ff:ff:ff:ff
    inet 116.203.246.196/32 scope global dynamic eth0
       valid_lft 84488sec preferred_lft 84488sec
    inet6 2a01:4f8:c0c:eb2a::1/128 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::9400:ff:fe54:ae27/64 scope link 
       valid_lft forever preferred_lft forever
3: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2e:17:ff:98:e9:35 brd ff:ff:ff:ff:ff:ff
    inet 10.45.95.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 2a01:4f8:c0c:eb2a::2/120 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::d4ff:62ff:fe85:2bad/64 scope link 
       valid_lft forever preferred_lft forever
5: vethc87e6752@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lxdbr0 state UP group default qlen 1000
    link/ether 2e:17:ff:98:e9:35 brd ff:ff:ff:ff:ff:ff link-netnsid 0
7: veth6c11522b@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lxdbr0 state UP group default qlen 1000
    link/ether ba:5d:e8:aa:5c:97 brd ff:ff:ff:ff:ff:ff link-netnsid 1

Works as expected.

tomponline commented 4 years ago

So I think I've found a way to work around that: use a /112 subnet and make sure it doesn't overlap with the low part of the subnet assigned to the host in the /64.

E.g.

On the LXD host:

ip a
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:16:3e:0e:2b:95 brd ff:ff:ff:ff:ff:ff
    inet 10.109.89.68/24 brd 10.109.89.255 scope global dynamic enp5s0
       valid_lft 1813sec preferred_lft 1813sec
    inet6 fd42:d37c:f0f2:a5f:216:3eff:fe0e:2b95/64 scope global dynamic mngtmpaddr noprefixroute 
       valid_lft 3205sec preferred_lft 3205sec
    inet6 fe80::216:3eff:fe0e:2b95/64 scope link 
       valid_lft forever preferred_lft forever
3: lxdbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether da:69:49:67:43:1c brd ff:ff:ff:ff:ff:ff
    inet 10.91.155.1/24 scope global lxdbr0
       valid_lft forever preferred_lft forever
    inet6 fd42:d37c:f0f2:a5f::1:1/112 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::d882:ffff:fe16:a096/64 scope link 
       valid_lft forever preferred_lft forever
lxc network show lxdbr0
config:
  ipv4.address: 10.91.155.1/24
  ipv4.nat: "true"
  ipv6.address: fd42:d37c:f0f2:a5f:0:0:1:1/112
  ipv6.dhcp.stateful: "true"
  ipv6.nat: "true"
description: ""
name: lxdbr0
type: bridge
used_by:
- /1.0/instances/c1
managed: true
status: Created
locations:
- none

This then allows DHCPv6 stateful with RA to work correctly (I can see RA arrive and IP addresses are renewed).
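To see which addresses the /112 in this example actually covers, here is a small Python sketch using the standard `ipaddress` module. It reads "the low part" as the lowest addresses of the /64 (such as `::1`), which is an assumption about what is meant, and it only describes the address layout, not dnsmasq's behaviour.

```python
import ipaddress

# Addresses taken from the `ip a` output in this comment.
bridge = ipaddress.IPv6Interface("fd42:d37c:f0f2:a5f::1:1/112")              # lxdbr0
host = ipaddress.IPv6Interface("fd42:d37c:f0f2:a5f:216:3eff:fe0e:2b95/64")   # enp5s0

net = bridge.network
print("bridge network:", net)
print("first address: ", net[0])
print("last address:  ", net[-1])
print("address count: ", net.num_addresses)

# The /112 still sits inside the host's /64 ...
print("inside host /64:", net.subnet_of(host.network))

# ... but it contains neither the host's own address nor the lowest
# address of the /64 (assumed here to be the "low part").
lowest = ipaddress.IPv6Address("fd42:d37c:f0f2:a5f::1")
print("contains host address:", host.ip in net)
print("contains ::1 of the /64:", lowest in net)
```

The /112 spans `fd42:d37c:f0f2:a5f::1:0` through `fd42:d37c:f0f2:a5f::1:ffff`, so it stays clear of both the start of the /64 and the host's SLAAC address while remaining inside the same /64.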