canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

add an option to disable dhcpv6 on fallback #5380

Open samuel-gauthier opened 1 month ago

samuel-gauthier commented 1 month ago

Enhancement

Since PR https://github.com/canonical/cloud-init/pull/4474, dhcpv6 is always enabled in fallback mode. Would it be possible to have a global option to disable it?

I have problems with the fallback mode in eni mode, using ifupdown. The following configuration is generated:

auto ens3
iface ens3 inet dhcp
# control-alias ens3
iface ens3 inet6 dhcp

If no DHCPv6 server answers on the interface, the networking service fails with "[FAILED] Failed to start Raise network interfaces.".
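
A minimal sketch of what the requested option could look like for the eni output above. The function name `render_fallback_eni` and the `dhcp6` flag are hypothetical and are not part of cloud-init; the sketch only reproduces the stanza shown in this comment and drops the inet6 lines when the flag is off.

```python
def render_fallback_eni(ifname: str, dhcp6: bool = True) -> str:
    """Render a fallback /etc/network/interfaces stanza for one interface.

    `dhcp6` is a hypothetical switch for illustration, not an existing
    cloud-init setting.
    """
    lines = [f"auto {ifname}", f"iface {ifname} inet dhcp"]
    if dhcp6:
        # Same two lines that the fallback config currently always emits.
        lines += [f"# control-alias {ifname}", f"iface {ifname} inet6 dhcp"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # IPv4-only fallback: no inet6 stanza, so ifupdown never waits on DHCPv6.
    print(render_fallback_eni("ens3", dhcp6=False), end="")
```

With `dhcp6=False` the output contains only the `auto` and `inet dhcp` lines, so ifupdown would not attempt DHCPv6 at all.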

holmanb commented 1 month ago

Hi @samuel-gauthier, thanks for raising this issue. Which distro are you using?

You proposed providing a key to configure this setting, but from cloud-init's perspective we'd rather not make this something that the user has to configure and instead make it "just work" correctly as expected, regardless of whether an ipv4 or ipv6 environment is used.

Since clouds are moving towards dual stack designs, cloud-init is expected to provide both ipv6 and ipv4. On more modern network backends this configuration translates to "move on once an IP is gained", which seems like a sensible default. Unfortunately I'm not familiar enough with ifupdown's user interface to suggest how this should work. Do you know how this could work with ifupdown?

samuel-gauthier commented 1 month ago

Hi @holmanb, thanks for checking this!

I'm using Ubuntu 22.04 and 24.04. I agree, it makes sense to enable both IPv4 and IPv6 from a general perspective.

I looked into it for a while and did not find a documented way to configure ifupdown to match the behavior you suggested, but maybe I missed it.

I did think of a workaround, by adding a cloud-init dhclient hook that would write a file on dhcp failure, and then use a pre-up script like:

pre-up if [ -f /tmp/dhcpv4_eth0_failed ]; then ifup eth0 inet6 dhcp; fi

But it seemed fragile, and it would not be efficient if only DHCPv6 was enabled as ifupdown would have to wait for DHCPv4 to fail before trying DHCPv6. Also, I don't know if the networking service will be happy with it in the end.

holmanb commented 1 month ago

> I'm using Ubuntu 22.04 and 24.04.

This works on netplan, why not use that?

> I did think of a workaround, by adding a cloud-init dhclient hook that would write a file on dhcp failure, and then use a pre-up script like:
>
> pre-up if [ -f /tmp/dhcpv4_eth0_failed ]; then ifup eth0 inet6 dhcp; fi
>
> But it seemed fragile, and it would not be efficient if only DHCPv6 was enabled as ifupdown would have to wait for DHCPv4 to fail before trying DHCPv6. Also, I don't know if the networking service will be happy with it in the end.

Agreed, this might work for a one-off solution but I don't think that we'd want to maintain it.

samuel-gauthier commented 1 month ago

The daemons used by netplan get in the way at scale (for instance, when many interfaces are added, the daemon listens to netlink and it's very costly) or when doing advanced networking.

I did not find another solution that does not involve adding an option in /etc/network/interfaces...

holmanb commented 1 month ago

> The daemons used by netplan get in the way at scale (for instance, when many interfaces are added, the daemon listens to netlink and it's very costly) or when doing advanced networking.

What is the measured cost that you see? It would be good to have a reproducer or characterization of the system where netplan is too costly for many interfaces (how many? what architecture?).

samuel-gauthier commented 1 month ago

For instance, when we add many interfaces, each daemon listening to the netlink interface messages will be costly. I don't think there is a particular problem with the systemd-networkd implementation; it just adds a lot to the system load in this case. You can try adding 5k interfaces, for instance, and check the systemd-networkd load (I see 100% for some time):

for i in $(seq 1 5000); do ip link add vrf$i type vrf table 10$i; done

Ifupdown is lightweight and stateless, which is nice in this particular use case, which is one of the reasons why I would like to keep using it.
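
The reproducer loop above can also be scripted, which makes it easier to vary the interface count when characterizing the load. This is only a sketch: `vrf_commands` and `run` are hypothetical helper names, the commands are the same `ip link add ... type vrf` invocations as in the shell loop, and actually executing them requires root on a disposable machine.

```python
import subprocess
from typing import List


def vrf_commands(count: int) -> List[List[str]]:
    """Build the same `ip link add` commands as the shell loop, one per VRF."""
    return [
        ["ip", "link", "add", f"vrf{i}", "type", "vrf", "table", f"10{i}"]
        for i in range(1, count + 1)
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False (needs root)."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run with a small count; raise it to 5000 to match the report.
    run(vrf_commands(3))
```

Watching the systemd-networkd process in a second terminal while the non-dry run executes shows the load described in this comment.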

holmanb commented 1 month ago

> For instance, when we add many interfaces, each daemon listening to the netlink interface messages will be costly. I don't think there is a particular problem with the systemd-networkd implementation; it just adds a lot to the system load in this case. You can try adding 5k interfaces, for instance, and check the systemd-networkd load (I see 100% for some time):
>
> for i in $(seq 1 5000); do ip link add vrf$i type vrf table 10$i; done
>
> Ifupdown is lightweight and stateless, which is nice in this particular use case, which is one of the reasons why I would like to keep using it.

+1 thanks for explaining, that helps a lot

holmanb commented 1 month ago

> I'm using Ubuntu 22.04 and 24.04.

Do you see this behavior on both 22.04 and 24.04?

> I did think of a workaround, by adding a cloud-init dhclient hook that would write a file on dhcp failure, and then use a pre-up script like:

Are you sure that you are using dhclient on both 24.04 and 22.04? It looks like the preferred client order is dhclient, udhcpc, dhcpcd. However in 24.04 cloud-init switched from using dhclient to dhcpcd as a dependency, so you would have to have manually installed dhclient (or had some other dependency which made isc-dhclient get installed).
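
To check which client such a preference order would select on a given machine, a rough sketch follows. It assumes the order stated above (dhclient, udhcpc, dhcpcd) and simply looks each name up on `PATH`; `pick_dhcp_client` is a hypothetical helper and is not how cloud-init itself is implemented.

```python
import shutil
from typing import Callable, Optional, Sequence

# Order as described in the comment above; not read from cloud-init's source.
PREFERRED_CLIENTS = ("dhclient", "udhcpc", "dhcpcd")


def pick_dhcp_client(
    clients: Sequence[str] = PREFERRED_CLIENTS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first client in `clients` found on PATH, or None."""
    for name in clients:
        if which(name):
            return name
    return None


if __name__ == "__main__":
    print(pick_dhcp_client())
```

On a system where only dhcpcd is installed this returns `dhcpcd`; if dhclient has been installed as well, it takes precedence, which is the situation the question above is probing.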

samuel-gauthier commented 1 month ago

I am working with Ubuntu 22.04, and will use Ubuntu 24.04 in the future. Sorry I was unclear.

holmanb commented 1 month ago

Also, what platform / cloud / datasource are you using?

samuel-gauthier commented 1 month ago

I am using several; the problem is seen when no networking configuration is provided. I have currently disabled the Ubuntu configuration to make sure that netplan is not used, and to force the use of the fallback (which generates /etc/network/interfaces.d/50-cloud-init.cfg).

holmanb commented 1 month ago

@samuel-gauthier thanks for the responses. Just curious, have you tested this with ifupdown-ng?

I've gotten reports of this working correctly in an ipv4-only environment with both dhclient and dhcpcd using ifupdown-ng on Alpine.

Since this renderer appears to be working currently on some platforms, I agree with your original assessment that an option to disable dhcpv6 on fallback would be the ideal solution - probably scoped to the eni renderer given that the other network backends seem to be able to handle this.

samuel-gauthier commented 1 month ago

@holmanb, I did try ifupdown-ng, which did not work out of the box on Ubuntu 22.04. I did not take the time to investigate more.