Open brtkwr opened 3 years ago
Hey @brtknr - Sorry if these questions are obvious..
Hi @dustymabe
I tried booting up a regular Fedora Cloud 33 image on the same OpenStack deployment and that seems to get an IP address allocated correctly, but that uses cloud-init of course. I am not sure what's missing in the Fedora CoreOS bootstrap logic to accommodate this.
Looks like something has gone wrong with SLAAC: https://en.wikipedia.org/wiki/IPv6#Stateless_address_autoconfiguration_(SLAAC)
If you put in the MAC address for the interface in question, fa:16:3e:51:33:e5, into https://www.vultr.com/resources/mac-converter/?mac_address=fa%3A16%3A3e%3A51%3A33%3Ae5, the expected Contained EUI-48 (U/L) address is f8:16:3e:ff:fe:51:33:e5, which matches the address on Neutron, but the Fedora CoreOS instance somehow renders this to a6:b5:33:c0:c5:5e:19:52, which is completely off.
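For anyone who wants to check the conversion without the web tool, here is a minimal Python sketch of the modified EUI-64 derivation (flip the universal/local bit, insert ff:fe in the middle). The function name `mac_to_eui64` is made up for illustration and is not part of any tool mentioned in this thread.

```python
def mac_to_eui64(mac: str) -> str:
    """Return the modified EUI-64 interface identifier for a 48-bit MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # Flip the universal/local bit (0x02) of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the first three and the last three octets.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    return ":".join(f"{o:02x}" for o in eui64)


print(mac_to_eui64("fa:16:3e:51:33:e5"))  # f8:16:3e:ff:fe:51:33:e5
```

The result for the MAC above matches the identifier Neutron expects, so an address whose lower 64 bits look nothing like it was not generated this way.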
I believe my issue is related to this actually: https://github.com/coreos/fedora-coreos-tracker/issues/513
On FCOS, NetworkManager is in charge of network configuration. That means that on your VM it is likely already handling ens3, and thus possibly conflicting with your manual ip addr changes.
I'm not surprised you are seeing weird results when mixing auto-configuration via NM and manual commands, and I would really recommend against doing the latter.
For further feedback on this ticket, it would be helpful to get the full logs from the NM service after a fresh boot and without further manual network tweaking. Also, the output of ip -6 addr and a brief description of the underlying network infra would be helpful (it looks like it could be a SLAAC setup for a private ULA subnet?).
Hi @lucab, it looks like the default setting for the interface is stable-privacy:
[core@k8s-devstack-v55mhdpu3cse-node-1 ~]$ nmcli connection edit "Wired connection 1"
===| nmcli interactive connection editor |===
Editing existing '802-3-ethernet' connection: 'Wired connection 1'
Type 'help' or '?' for available commands.
Type 'print' to show all the connection properties.
Type 'describe [<setting>.<prop>]' for detailed property description.
You may edit the following settings: connection, 802-3-ethernet (ethernet), 802-1x, dcb, sriov, ethtool, match, ipv4, ipv6, hostname, tc, proxy
nmcli> print ipv6.addr-gen-mode
ipv6.addr-gen-mode: stable-privacy
After changing this to 0, i.e. eui64 (based on this guide: https://ibert.tech/articles/activate-eui-64-on-ubuntu-desktop.html), I am now getting a stable address and the instance is reachable externally. Now my question is why this is stable-privacy by default, as this seems to render the instance unreachable via IPv6.
What appears to be happening is that the address generated by the stable-privacy setting renders the instance unreachable. It needs the address generated via the eui64 config to work.
I'm able to reproduce the same behavior on Vexxhost (openstack public cloud provider). Thanks @brtknr for reporting the issue.
You can see that the profile autogenerated by NM has ipv6.addr-gen-mode=stable-privacy:
$ sudo cat "/run/NetworkManager/system-connections/Wired connection 1.nmconnection"
[connection]
id=Wired connection 1
uuid=4d2aec5c-0c3d-3fe4-8927-c3eb9d198d42
type=ethernet
autoconnect-priority=-999
interface-name=ens3
permissions=
timestamp=1627308807
[ethernet]
mac-address-blacklist=
[ipv4]
dns-search=
method=auto
[ipv6]
addr-gen-mode=stable-privacy
dns-search=
method=auto
[proxy]
[.nmmeta]
nm-generated=true
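If you want to check this across several profiles, a rough Python sketch like the one below can pull the value out of a keyfile. It assumes the keyfile is close enough to INI syntax for `configparser` to read it, which holds for the simple profile shown here but may not hold for every keyfile. The helper name `ipv6_addr_gen_mode` and the trimmed sample text are only for illustration.

```python
import configparser

# Trimmed copy of the autogenerated profile shown above.
KEYFILE_TEXT = """\
[connection]
id=Wired connection 1
type=ethernet
interface-name=ens3

[ipv6]
addr-gen-mode=stable-privacy
method=auto
"""


def ipv6_addr_gen_mode(keyfile_text: str):
    """Return the explicit ipv6 addr-gen-mode of a keyfile, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(keyfile_text)
    # fallback=None covers both a missing [ipv6] section and a missing key.
    return parser.get("ipv6", "addr-gen-mode", fallback=None)


print(ipv6_addr_gen_mode(KEYFILE_TEXT))  # stable-privacy
```

A result of None only means the profile does not set the value explicitly; which default NM then applies depends on how the profile was created.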
A few comments/questions for the broader community:
* [...] ipv6.addr-gen-mode (eui64 or stable-privacy)? Is the answer the same for all platforms? This might be worth its own issue tracker discussion ticket.
* [...] ipv6.addr-gen-mode globally. After a brief discussion with the NM team we're going to revisit and see if it's worth setting that in the global configuration: https://bugzilla.redhat.com/show_bug.cgi?id=1743161#c11
WORKAROUND
Add this bit to your butane configs:
variant: fcos
version: 1.3.0
storage:
  files:
    - path: /etc/NetworkManager/system-connections/default.nmconnection
      mode: 0600
      contents:
        inline: |
          [connection]
          id=Wired Connection
          type=ethernet
          autoconnect-retries=1
          multi-connect=3
          permissions=
          [ethernet]
          mac-address-blacklist=
          [ipv4]
          dhcp-timeout=90
          dns-search=
          method=auto
          [ipv6]
          addr-gen-mode=eui64
          dhcp-timeout=90
          dns-search=
          method=auto
          [proxy]
The file was generated with:
/usr/libexec/nm-initrd-generator -s -- ip=dhcp,dhcp6
and then the uuid was removed.
Thanks @dustymabe for confirming the issue and the workaround, that's quite handy indeed! I will give it a whirl and get back to you.
I suspect this is basically: client systems should use stable-privacy, servers should use eui64, right?
That makes sense to me
Luca pointed this out in the meeting today: Openstack does at least document this limitation: https://docs.openstack.org/neutron/wallaby/admin/config-ipv6.html#configuring-interfaces-of-the-guest
Is this a primarily server side OS or client side? I would put my guess on server side given the immutability etc. How do other clouds handle this issue?
This workaround did the trick btw, thank you
We touched this topic in the last meeting, and we want to look around a bit more before touching anything:
* ACTION: - dustymabe to figure out how the cloud edition is handling
the ipv6.addr-gen-mode=stable-privacy problem (dustymabe, 17:14:17)
On Fedora Cloud base cloud-init writes out this file:
$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Created by cloud-init on instance boot automatically, do not edit.
#
BOOTPROTO=dhcp
DEVICE=eth0
HWADDR=fa:16:3e:24:3c:3b
IPV6INIT=yes
IPV6_AUTOCONF=yes
MTU=1500
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
The default when creating a connection via D-Bus is stable-privacy and the default when reading a keyfile/ifcfg-rh file from disk is eui64. The difference in behavior was put in place to ease legacy migrations.
2021-07-26 10:54:43 @thaller explicit values "eui64" and "stable-privacy" (and confusingly, the default value differs whether the profile gets received via D-Bus, or loaded from keyfile/ifcfg-rh file).
...
2021-07-26 10:58:53 @thaller this was done so that when you create a profile on D-Bus (that is "now"), then the new default is "stable-privacy". If you have a profile on disk (created 5 years ago), then the default would stay at eui64.
TL;DR Fedora Cloud Base does not have this problem
Also, just in case this wasn't clear: if I delete /etc/sysconfig/network-scripts/ifcfg-eth0 and reboot the F34 cloud base instance then we get the same config (i.e. dynamically created NM connections) and the same behavior as FCOS.
The workaround (using butane configuration) is a viable alternative for OKD4 based nodes (using variant: openshift instead of variant: fcos), but it does not help when using networkType: OVNKubernetes.
Reference https://bugzilla.redhat.com/show_bug.cgi?id=1743161#c14
We discussed this in the community meeting today.
12:56:15 dustymabe | #agreed We will work with the NetworkManager team to get in place a
| configuration setting for a default value for ipv6.addr-gen-mode and
| apply that to all of FCOS when it's ready.
12:56:15 dustymabe | #agreed In the shorter term we may try to find some other way to set it
| without requiring the NetworkManager feature to be implemented.
12:56:15 dustymabe | #agreed We'll also reach out to FESCO and try to convince the rest of
| Fedora that `stable-privacy` makes most sense in a workstation/laptop
| setting and we should apply eui64 as the default for all server like
| variants.
Sounds like it was a productive meeting!
12:56:15 dustymabe | #agreed We'll also reach out to FESCO and try to convince the rest of | Fedora that `stable-privacy` makes most sense in a workstation/laptop | setting and we should apply eui64 as the default for all server like | variants.
That sounds like the proper solution to me.
https://bugzilla.redhat.com/show_bug.cgi?id=1743161 has been closed as WONTFIX. Is there another way to set a default for this setting?
Had a discussion this morning with the NM team. Lubomir in particular had some strong pushback against reverting to eui64 for server like editions in Fedora:
Reverting the defaults back to obsolete EUI-64 method is a complete no-go. It has been
deprecated for very good reasons, IETF's position is detailed in [RFC 8064].
Apart from the privacy issues, the problems of EUI-64 mechanism affecting
the servers are:
* The EUI-64 identifiers, being based on hardware address change with
replacement on interface cards. Not great on servers.
* The mechanism produces a single address. On DAD failures, the machine
ends up with not connectivity.
If the machine needs a predictable address, either the provisioning should
assign a static address or utilize DHCPv6. For the cases where this wouldn't
be possible (and the environment is controlled enough for the various issues
with EUI-64 don't apply, e.g. virtual networks), we are willing to provide
a way to switch defaults to EUI-64 in system configuration, but we're
strongly opposed to making this any sort of default.
I'll be joining the meeting in an hour, happy to discuss this further.
Pointers to relevant RFCs, for reference:
[RFC 7217] A Method for Generating Semantically Opaque Interface
Identifiers with IPv6 Stateless Address Autoconfiguration (SLAAC)
<https://www.rfc-editor.org/rfc/rfc7217>
[RFC 8064] Recommendation on Stable IPv6 Interface Identifiers
<https://www.rfc-editor.org/rfc/rfc8064>
However, it was brought up by other participants in the meeting that in some cases this does cause an issue where the hypervisor or cloud platform might not know the address being used by the instance. I believe under qemu, if qemu-guest-agent is installed (we don't have that installed in FCOS), it could be picked up that way. The NM team would like to understand the cases where a hypervisor/platform mismatch exists so they can further understand the problem here.
What we did agree on was that we could get a global NM configuration knob for configuring this. They asked me to open a new BZ (not re-open BZ#1743161). I did that here: BZ#2082682
The upstream change for BZ#2082682 landed in NetworkManager 1.39.8+
We are now unblocked to move forward with setting a global default at least in rawhide.
An example butane config that sets the global default:
variant: fcos
version: 1.4.0
storage:
  files:
    - path: /etc/NetworkManager/conf.d/90-ipv6-addr-gen-mode-override.conf
      mode: 0600
      contents:
        inline: |
          [connection-90-ipv6-addr-gen-mode-override]
          match-device=type:ethernet
          ipv6.addr-gen-mode=0
We discussed this at our community meeting today.
It seems that given new information there isn't as much support for changing the global default for all FCOS platforms. For OpenStack specifically we want to do a little more investigation to see if we can dynamically determine if the OpenStack env is set up for IPv6 SLAAC and see if we can dynamically set the global configuration default in that case. There may be a chicken and egg issue there though.
13:23:59* dustymabe | #action jlebon to reach out to OpenStack experts to see if we can
| detect when the platform is expecting machines to do IPV6 network
| configuration via SLAAC (to get eui64 based IPv6 addresses)
We'll then make the determination if we want to set it conditionally or unconditionally on OpenStack.
Tangentially another point was brought up that may influence our decision to set a global default for this. I've opened https://github.com/coreos/fedora-coreos-tracker/issues/1266 to continue that discussion.
I've reached out to Rodolfo Hernandez (thanks so much!) who works on OpenStack Neutron. The TL;DR is:
Given the above, I think we can investigate changing the default only for OpenStack. There was a doubt however about whether setting ipv6.addr-gen-mode=eui64 even if DHCPv6 is used can cause any issues. From my reading of the docs, that's not the case but it'd be good to test it to confirm or reach out to NM folks.
Nothing comes to mind. From NetworkManager's point of view, with ipv6.addr-gen-mode=eui64|stable-privacy it always generates some IPv6 interface identifier. The value that it generates has no further significance and should not have any relation to DHCPv6. Well sure, different link local and SLAAC addresses get generated, but that is probably not a cause for problems.
If you find a problem, please report :)
We discussed this in the community meeting today.
13:41:17 dustymabe | #agreed we will set ipv6.addr-gen-mode=eui64 as the
| default on our OpenStack platform since the platform
| expects this to be the case. We will attempt to leave
| currently deployed systems alone so that we don't
| change an existing system's IP address.
I hate to ask, but for anyone, including myself, in which version is this fix included?
@MindTooth According to this bugzilla, NetworkManager-1.39.10-1.el8 contains the implementation that allows setting ipv6.addr-gen-mode in global config.
Exoscale provider is facing the same issue, initial report https://github.com/coreos/fedora-coreos-tracker/issues/513
Following https://github.com/coreos/fedora-coreos-tracker/issues/907#issuecomment-1226134528: is there any hope of seeing this change land any time soon?
Describe the bug
I am attempting to spawn VMs on OpenStack with IPv6 enabled. The interface appears but the IP address is incorrect. For example, instead of fd5e:d3bb:de2e:0:f816:3eff:fe9d:1342/64, the IP address that is rendered is fd5e:d3bb:de2e:0:6f88:6036:cd38:c7cc/64.
Reproduction steps
Steps to reproduce the behavior:
* openstack server create --image fedora-coreos-34.20210529.3.0-openstack.x86_64 --flavor ds2G --key-name default --network private test-fcos
* sudo ip addr add fd5e:d3bb:de2e:0:f816:3eff:fe9d:1342/64 dev ens3 to make IPv6 work on the instance
Expected behavior
IPv6 should work out of the box.
Actual behavior
The IPv6 address is rendered incorrectly and as a result doesn't work.
System details
Ignition config
No config provided.
Additional information
I tried the suggestion here: https://github.com/coreos/fedora-coreos-tracker/issues/888#issuecomment-878426854 with a reboot but that didn't help my situation. Happy to tmate for anyone who wants to inspect the issue.
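A quick way to tell the two addresses in the bug description apart is to check whether the lower 64 bits carry the ff:fe marker of an EUI-64 identifier. The Python sketch below does that with the standard `ipaddress` module. The helper name `eui64_mac_from_address` is made up for illustration, and the check is only a heuristic, since a randomly generated identifier could in principle contain the same bytes.

```python
import ipaddress


def eui64_mac_from_address(address: str):
    """Return the MAC embedded in an EUI-64 style IPv6 address, else None."""
    ip = ipaddress.IPv6Interface(address).ip
    iid = ip.packed[8:]  # lower 64 bits: the interface identifier
    # EUI-64 identifiers have ff:fe in the two middle bytes.
    if iid[3:5] != b"\xff\xfe":
        return None
    # Undo the universal/local bit flip and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:]
    return ":".join(f"{b:02x}" for b in mac)


print(eui64_mac_from_address("fd5e:d3bb:de2e:0:f816:3eff:fe9d:1342/64"))  # fa:16:3e:9d:13:42
print(eui64_mac_from_address("fd5e:d3bb:de2e:0:6f88:6036:cd38:c7cc/64"))  # None
```

The expected address decodes back to a MAC with the fa:16:3e prefix seen on the other interfaces in this thread, while the address FCOS actually configured does not decode at all, which is consistent with it having been generated by stable-privacy.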