Closed: jeffaco closed this issue 5 years ago.
I can now answer question 2 above:
We made a change to the switch fabric in Gen4 where ethernet interfaces are tied to specific purposes. We did that for traffic routing purposes, so we can more easily control exactly what traffic goes over what port. The network design folks never tell me about these things in advance - sigh.
I'd still like answers to questions 1 and 3 above, though. I'm hoping for an optional `mac_address` entry in the network configuration to allow specification of the MAC address, perhaps something like this:
```yaml
networking:
  -
    interface: eth0
    mac_address: "00:25:b5:1a:00:0d"
    vlan: 50
    vlan_mtu: 1500
    ip: 172.16.34.31
    gateway: 172.16.34.1
    subnet_mask: 255.255.255.0
    mtu: 1500
```
I'm open to suggestions, though. Thoughts?
> Once modified, is the only option to reboot? Is there some way to dynamically re-read this file?
Persistent udev network rules only apply at udev start, and in this case even at kernel network driver load. So no, there is no other way than a reboot. Those rules have to exist at reboot time, and I can only offer to add them to the image description for the Very Large Instance build. Or do you see this on the Large Instance image deployed to you too?
> How can we modify the YAML to generate a correct file the first time so that the network is reachable after first boot?
This should not happen at the YAML level. You request interface names per MAC address, and it is pure udev/kernel level code that creates the interface names.
Last but not least, you marked this as a bug on our side, but I don't see how this is a bug. Your fabric changed and that influenced the enic rules. We can adapt our image build, but to be honest this type of change is quite painful for any system that deals with persistent network interface names.
Deleting the bug flag and setting the discussion flag, as I think we can only help out on the udev rule setup, which I'm not sure you want. Any future change in your fabric will cause this trouble again.
Some more info on this.
The eth* names are assigned, as Marcus already pointed out, at boot time by the kernel. The names eth0 to eth? are assigned based on the order of discovery, or stated otherwise, based on when a udev event is triggered. A udev event is triggered when the device becomes available to the kernel from the HW side. The order may be different at every boot, so hard-coding the MAC address into the ifcfg-ethX file is risky: the next time around, the hard-coded MAC address might not match the interface name, which means the interface will not be brought up by wicked.
One way to avoid this problem is to switch to predictable names [1] for network interfaces. This scheme has a different set of problems, but those problems would not apply in our use case. The image is always deployed on the same hardware, Gen 3 or Gen 4, and based on that it is known where the interfaces are and what they should be named. Thus, if the YAML config wrote the interface names as predictable names and we switched the image build to use predictable names, we could avoid the problem stated in the original posting. Instead of
interface: eth0
you would then have
interface: ens1
for example. If we switch to predictable names, we do not need to make any changes to the initialization code or the YAML syntax to handle both Gen 3 and Gen 4 in a consistent way.
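As an aside, one way to see which predictable names udev would pick for the interfaces on an existing system is the `net_id` builtin (this is the same command used further down in this thread; the output fields depend on the hardware):

```
udevadm test-builtin net_id /sys/class/net/eth0 2>/dev/null | grep '^ID_NET_NAME_'
```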
> Once modified, is the only option to reboot? Is there some way to dynamically re-read this file?
Yes
> How come the file is generated properly (100% of the time) on Gen3, but not on Gen4?
Luck
> How can we modify the YAML to generate a correct file the first time so that the network is reachable after first boot?
We cannot; the file you are listing is not touched by the initialization code. The initialization code only writes the ifcfg-eth* files. Those files determine how wicked brings up the interfaces.
[1] https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
Ooh, shudder. I didn't realize this was such a sticky issue. This, by the way, is a high priority issue for us since it affects LI, which affects the bulk of our deployments. (We don't have VLI images yet anyway.)
I gave this issue a "bug" label because the network was not coming up properly. Please accept my apology if you don't consider it a bug.
I read over the posting on Predictable Network Interface Names, so I understand that much, at least. This will bleed over to customer visibility, and perhaps software (not sure), so I'll need to drag other folks in to weigh in on any potential solutions.
This is a general change in Gen4, so whatever we come up with should handle both LI and VLI, as both will be affected.
I had a number of questions and some comments:
Comment: Ultimately, we're tying firewall rules (on network fabric) to interfaces (in the O/S).
Question: Does the interface name in the O/S really matter? `eth0` or `ens1`, that's just a human label for it, right? After all, traffic is routed to whatever interface by route rules, so you just access the network (at some I/P address), and it "just works". Right? From software, other than configuring the network, does it matter what the name of the interface is?

Comment: Regarding interface name changes (i.e. `ens1`), this is customer visible. So a bunch of folks here will need to sign off if we pick something like this.

What we've done before in the past (what we were doing before SUSE-generated images):

- We edit `/etc/udev/rules.d/70-persistent-net.rules` and reboot, which allows the network to come up.
- We then delete the `/etc/udev/rules.d/70-persistent-net.rules` file. But even with that file deleted, the network names are extremely consistent after that. I don't personally understand how this could be, unless udev writes to some other file allowing predictable ordering once the names are established (as long as underlying hardware doesn't change).
- We then build an image without the `/etc/udev/rules.d/70-persistent-net.rules` file, and on first boot of that new image (on new hardware), the interface names are always correct. This, honestly, is a little beyond my understanding, but the network folks said this was so, so I believe them, even if I don't fully understand how it's so.

So, given all this, suggestions? How can we predictably tie the O/S interfaces to the network fabric interfaces if not using some common piece of information (i.e. mac address)?
> Question: Does the interface name in the O/S really matter?
Yes and No
> From software, other than configuring the network, does it matter what the name of the interface is?
No
> Regarding interface name changes (i.e. ens1), this is customer visible
Yes, if the customer runs "ip a" or looks at /etc/sysconfig/network. I would not necessarily call this customer visible.
would a "quick reboot" suffice
Yes, a kexec will re-enumerate all the devices.
> How can we predictably tie the O/S interfaces to the network fabric interfaces if not using some common piece of information (i.e. mac address)?
That's the problem predictable network interface names solve, as implied by your question using "predictably". The interface on a given bus in a given slot always has the same name, no matter in which order the interface is detected. Based on your explanation of UCS, that would apply: UCS should apply the same configuration to a new blade, i.e. the same network interface with the same MAC shows up in the same position (bus and slot) on a new blade.
The "MAC address" while common knowledge is not anything the kernel uses. The kernel couldn't care less what the MAC address of a device is when an interface is first detected as present.
If we do not use predictable names and we add the MAC address to the YAML, then we end up in the following situation.
That the removal of the /etc/udev/rules.d/70-persistent-net.rules file works is luck, nothing else. The code that writes this file runs on every reboot, and for every interface that has no rule in /etc/udev/rules.d/70-persistent-net.rules a new rule will be generated, i.e. the file gets re-created on every boot if it does not exist. The generation code also runs when a new network interface is added.
So if we have to stick to persistent names, i.e. use the "ethX" naming, then the initialization code can/has to modify /etc/udev/rules.d/70-persistent-net.rules and in that file assign the names based on MAC address as we want them to be. This would then be persistent across all boots, as the information from /etc/udev/rules.d/70-persistent-net.rules is applied on every boot and device names are renamed to match the information in this file. This approach, however, means yet another workaround.
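For illustration, such MAC-pinning entries in /etc/udev/rules.d/70-persistent-net.rules would look roughly like this (a sketch; the MAC is borrowed from the example YAML earlier in this issue, and the exact attributes may vary):

```
# illustrative only: pin the NIC with this MAC to the name eth0
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:25:b5:1a:00:0d", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
```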
The real solution to the problem described is to use predictable interface names; sticking with persistent names just creates a cascading effect of workaround solutions that introduce a potential race condition. Every race condition will eventually be hit for unexplained reasons, and then the system will have no working network.
I strongly suggest that we move to predictable names for this setup.
Thanks for the detailed reply.
To be clear, if you guys feel that predictable names make sense, I'm fine with that (and will push that solution on our end, even though stakeholders may not like the fact that there is a new naming convention).
However, I'm still a little vague on what ties this to the network fabric. If the network fabric has a notion of eth0-eth5, each with specific MAC addresses (I know the kernel doesn't really care about MAC addresses), what ties the network fabric's notion of `eth0` to the O/S notion of `ens1`?

Here's the network fabric's notion of a blade:

So, in this example, what ties the network fabric's notion of `eth0` to `ens1`, if we move to predictable names? As I said, I am not averse to predictable names at all. I just would like to understand how it would work ...
Could we get a test image to see if it actually works in our environment?
OK, I see the confusion. Again, these are just names. The network fabric happens to use eth? names, but that really has (should have) nothing to do with what the kernel names the devices.
What would make sense, and I am not saying that UCS is implemented that way, is that you give a network interface a certain name in the fabric setup and you associate that with a certain MAC address. Then the fabric assigns that MAC address to a given network card in the blade.
So let's say we have a blade system with 3 NICs. From a hardware perspective all blades are the same and the NICs are all attached to the same bus in the same order. Meaning the network fabric has some notion that the NIC in slot one is "eth0", the NIC in slot two is "eth1" and the NIC in slot three is "eth2". So when you assign "mac-A" to eth0, this means that the NIC in slot 1 will get "mac-A" from the fabric. "mac-B" would go to the NIC in slot 2 and "mac-C" to the NIC in slot 3. However, this should have nothing to do with the way the kernel perceives the NICs. As far as the kernel is concerned, the NIC in slot 1 gets a name according to the rules we use, "persistent" (eth?) or "predictable" (ens?). If we use persistent names on the kernel side, the name may or may not match what's set up in the fabric. As discussed, that depends on the order of udev events, i.e. jitter in the HW initialization. If the NIC in slot 2 happens to show up first, it will get the "eth0" name in the kernel and will have the "eth1" name in the fabric. Basically what you observed, and what started this discussion.
Now, if the YAML is generated based on what's in the fabric, things will be cross-wired; again, this is consistent with the observed behavior.
In the above I say "should" because I do not know what he Cisco kernel driver modules do and if they do or do not establish some correlation to the fabric. Although that would be weird.
Given the behavior you showed at the beginning of this thread I would say the Cisco drivers do not establish such weird connections to the fabric.
So from a configuration perspective it is actually easier for you to generate the YAML based on the fabric information because you can establish the mapping that "eth0" in the fabric is the NIC in the first slot on some bus in the blade and that will always be the same and so if we use persistent names there is very little that can go wrong.
> So from a configuration perspective it is actually easier for you to generate the YAML based on the fabric information because you can establish the mapping that "eth0" in the fabric is the NIC in the first slot on some bus in the blade and that will always be the same and so if we use persistent names there is very little that can go wrong.
That sounds awesome to me. So, if I'm understanding you properly, if we move to persistent names, then regardless of startup "jitter", `ens1` will always be what the network fabric sees as `eth0`, like this:
| Fabric NIC | OS persistent name |
|---|---|
| eth0 | ens1 |
| eth1 | ens2 |
| eth2 | ens3 |
| eth3 | ens4 |
| eth4 | ens5 |
| eth5 | ens6 |
If that's the case, that sounds awesome, and would solve the issue completely.
I guess I have some remaining questions:
Is this a change to the O/S, or just a YAML change? If the latter, what do I change? If the former, then:
Test image possible? I'm thinking "yes" since you are the professionals in image building 😃
Out of curiosity, why weren't existing interfaces `eth0` ... `eth(n)` just set up as persistent names by default in the Linux kernel, thus avoiding this issue to begin with? Backwards compatibility issues?
> So from a configuration perspective it is actually easier for you to generate the YAML based on the fabric information because you can establish the mapping that "eth0" in the fabric is the NIC in the first slot on some bus in the blade and that will always be the same and so if we use persistent names there is very little that can go wrong.

> That sounds awesome to me. So, if I'm understanding you properly, if we move to persistent names,
Oops my bad, we would switch to "predictable" names.
> then regardless of startup "jitter", `ens1` will always be what the network fabric sees as `eth0`, like this:
>
> | Fabric NIC | OS persistent name |
> |---|---|
> | eth0 | ens1 |
> | eth1 | ens2 |
> | eth2 | ens3 |
> | eth3 | ens4 |
> | eth4 | ens5 |
> | eth5 | ens6 |
Yes that would be the map. We just have to determine whether the interfaces are really "ens" or some other name.
> If that's the case, that sounds awesome, and would solve the issue completely.
>
> I guess I have some remaining questions:
>
> - Is this a change to the O/S, or just a YAML change?
It is a change to the O/S image (the boot configuration that enables predictable names) and to the interface names you put into the YAML, but NOT a change to the YAML schema or setup code.
> If the latter, what do I change?
You would write `interface: ens1` instead of `interface: eth0`.
> If the former, then:
>
> - Test image possible? I'm thinking "yes" since you are the professionals in image building
You'll have a SLES 12 SP4 For SAP test image tomorrow (https://github.com/SUSE-Enceladus/azure-li-services/pull/116#issuecomment-462470148) 😃
> Out of curiosity, why weren't existing interfaces `eth0` ... `eth(n)` just set up as persistent names by default in the Linux kernel, thus avoiding this issue to begin with? Backwards compatibility issues?
Well, the debate about "persistent" (eth0) vs. "predictable" (ens1) names is long and it is a stony road. The argument for predictable names and why they make sense is clearly displayed here. However, "predictable" is not predictable ahead of time. Meaning I cannot tell you what the names of the interfaces in the UCS blades will actually be. One has to know the internals of the HW to predict the names. So I cannot predict what the interface on a given piece of HW will be named until I have done at least one installation and let the "predictable" name logic figure it out. Yes, there is also the compatibility issue and many years of scripts that do special stuff based on the "knowledge" that there will be something called "eth0". Shudder, but that is reality. Also, in many environments there is only one NIC, so it's always eth0 and jitter doesn't matter.
Lots of information since I left the desk yesterday. So yes, predictable network interface names are the solution to this problem. I will move the image descriptions now to activate net.ifnames properly, and I think we are good with the other open PR. So the image will have both issues addressed. Whether it all works out of the box is something I can't tell; we need your help and feedback to come to the final solution.
stay tuned
Devel Images have all been updated to use predictable network interface names:

- ImagesSLE124P
- ImagesSLE12
- ImagesSLE15
Predictable names are - well - yucky, particularly in the fact that they are not all that predictable. The problem with predictable names:
On a Gen4 UCS Test Node:
```
Sollabdsm31:~ # for i in 0 1 2 3 4 5; do echo "----- For eth$i -----"; udevadm test-builtin net_id /sys/class/net/eth$i 2>/dev/null | grep '^ID_NET_NAME_'; done
----- For eth0 -----
ID_NET_NAME_MAC=enx0025b51b000e
ID_NET_NAME_PATH=enp72s0
----- For eth1 -----
ID_NET_NAME_MAC=enx0025b51a000e
ID_NET_NAME_PATH=enp73s0
----- For eth2 -----
ID_NET_NAME_MAC=enx0025b51b0038
ID_NET_NAME_PATH=enp74s0
----- For eth3 -----
ID_NET_NAME_MAC=enx0025b51a000f
ID_NET_NAME_PATH=enp80s0
----- For eth4 -----
ID_NET_NAME_MAC=enx0025b51b0037
ID_NET_NAME_PATH=enp81s0
----- For eth5 -----
ID_NET_NAME_MAC=enx0025b51a000d
ID_NET_NAME_PATH=enp82s0
Sollabdsm31:~ #
```
So if we're avoiding the MAC-based interface names, that leaves the enp* interfaces (72s0-74s0 and 80s0-82s0). Okay.
However, on a second Gen4 UCS test node:
```
Sollabdsm32:~ # for i in 0 1 2 3 4 5; do echo "----- For eth$i -----"; udevadm test-builtin net_id /sys/class/net/eth$i 2>/dev/null | grep '^ID_NET_NAME_'; done
----- For eth0 -----
ID_NET_NAME_MAC=enx0025b51b0012
ID_NET_NAME_PATH=enp200s0
----- For eth1 -----
ID_NET_NAME_MAC=enx0025b51b0011
ID_NET_NAME_PATH=enp201s0
----- For eth2 -----
ID_NET_NAME_MAC=enx0025b51a0014
ID_NET_NAME_PATH=enp202s0
----- For eth3 -----
ID_NET_NAME_MAC=enx0025b51a0013
ID_NET_NAME_PATH=enp208s0
----- For eth4 -----
ID_NET_NAME_MAC=enx0025b51a0012
ID_NET_NAME_PATH=enp209s0
----- For eth5 -----
ID_NET_NAME_MAC=enx0025b51a0011
ID_NET_NAME_PATH=enp210s0
Sollabdsm32:~ #
```
This is totally unexpected, as the interface names are different from blade to blade. This is awful, and implies that we would need to put an O/S on each and every blade just to figure out what the interfaces would be. And if we needed to move a profile to a different blade (due to blade failure, say, which UCS supports), networking would be problematic. Yuck!
Why, on one node, is eth0 -> enp72s0, while on the other node eth0 -> enp200s0? I really expected these to be consistent from UCS blade to UCS blade ...
I'm actually thinking that MAC-address-based names might be better, although I'm having different issues with that one (however, I think those issues can be resolved; I will need to check with the networking folks). At least I can reasonably predict the name if I know the MAC address ...
Is there a way that predictable names can be predictable from UCS blade to UCS blade?
> This is totally unexpected, as the interface names are different from blade to blade.
It means your blades are different.
```
eth4: enp209s0
eth4: enp81s0
```
The card on one system is at PCI bus location 209 (slot 0) and the card on the other system is at PCI bus location 81 (slot 0). The fact that both got assigned eth4 in the past is pure luck, because the device ordering, as the kernel sees it, is non-deterministic.
Predictable names for the network cards assume you know where on the PCI bus the card is plugged in. If there is no logic for us to know that, it's going to be very hard. The logic you used before is unstable, as Robert already explained.
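As a side note, a quick way to see where on the PCI bus a given NIC sits is to follow its sysfs device link (a sketch; the interface and values are taken from the output above and will differ per system):

```
# the trailing PCI address is what the enpXsY name is derived from
readlink -f /sys/class/net/eth4/device
# e.g. .../0000:d1:00.0 on one blade (0xd1 = 209 -> enp209s0)
# and  .../0000:51:00.0 on the other (0x51 =  81 -> enp81s0)
```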
So the question here is: are the NICs at the same PCI locations on every blade, or do they differ from blade to blade? If the answer is that it's all different from one blade to another, then we need host-specific information in the yaml file. Which in other words means you need to know the interface name or the bus location per instance. I guess this is a cluster, and the selection of the blade that actually runs the system happens at another level of the infrastructure?
Houston we have a problem
Well, we can try and write ifcfg-enx* files. But I do not know what wicked does in that case, so that needs to be tested. The only thing we apparently know is the MAC address, as that gets assigned by the UCS network fabric software. Then the YAML would contain

`interface: enx0025b51b0012`

for example, and the setup code would write "ifcfg-enx0025b51b0012". But I have no idea if wicked will find that interface based on this name. With predictable names, an entry for "enp200s0" will exist in /sys/class/net but no entry based on MAC address will exist. I don't think the MAC-based identification will work, but it is worth a test.
I asked Marius about this...
However, how would that help? Any card has a unique MAC address. This would also mean each blade has to have a dedicated yaml file with `interface: enx0025b51b0012`, or we create a udev rule that maps any potentially existing MAC address to an interface name. None of this looks like a nice solution to me.
The MAC address is set by the network fabric of the UCS system and is known, as @jeffaco has shown in an earlier comment.
Anyway, I had another idea this morning about how to solve this problem. We can rewrite the persistent-net rules file in a safe way. Here is my proposal: https://github.com/SUSE-Enceladus/azure-li-services/blob/netEnum/azure_li_services/nic_enumeration.py

I only implemented the core logic. Full integration with tests, service setup etc. needs to be completed if we agree on this approach. The YAML would then get a "mac" entry.
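To illustrate the general idea only (this is not the code in nic_enumeration.py, just a minimal sketch of rewriting a persistent-net style rule file from a MAC-to-name mapping; the function name and rule shape are assumptions):

```python
from pathlib import Path

# Hypothetical sketch: pin interface names to MAC addresses via a
# persistent-net style udev rule file (writing to /etc requires root).
RULE = ('SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
        'ATTR{{address}}=="{mac}", ATTR{{type}}=="1", NAME="{name}"')

def write_persistent_net_rules(mac_by_name,
                               path='/etc/udev/rules.d/70-persistent-net.rules'):
    """mac_by_name: e.g. {'eth0': '00:25:b5:1a:00:0d'} taken from the YAML."""
    lines = [RULE.format(mac=mac.lower(), name=name)
             for name, mac in sorted(mac_by_name.items())]
    Path(path).write_text('\n'.join(lines) + '\n')

if __name__ == '__main__':
    # write to the current directory for a dry run
    write_persistent_net_rules({'eth0': '00:25:b5:1a:00:0d'},
                               path='./70-persistent-net.rules')
```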
> The MAC address is set by the network fabric of the UCS system and is known, as @jeffaco has shown in an earlier comment.
OK, so that means the MAC is the stable factor here and will become a mandatory setting in the network section.
I got further information from Marius. In theory (untested) we could provide a udev rule in our images that does this:

`/etc/udev/rules.d/79-net-rename-mac.rule`:

```
SUBSYSTEM=="net", ACTION=="add", NAME=="", ENV{ID_NET_NAME_MAC}!="", NAME="$env{ID_NET_NAME_MAC}"
```
This should allow us to use the MAC-based interface name. A rewrite of the net rules to ethX would then not be required, and the interface would explicitly be named according to its MAC. I haven't tested this, but if it works I would prefer this approach.
Will do some testing
Ok I had some success in my testing. I would go the following way:

- In our image descriptions we adapt 80-net-setup-link.rules to create interfaces based on the MAC
- Move from ID_NET_NAME to ID_NET_NAME_MAC

This will result in interface names looking like this:

```
2: enxdeadbeefb8c2: <BROADCAST,MULTICAST> mtu 9000 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether de:ad:be:ef:b8:c2 brd ff:ff:ff:ff:ff:ff
```
This allows us to create ifcfg- configurations based on the MAC address name. The yaml file then needs to specify network configurations as such:

```yaml
networking:
  interface: enxdeadbeefb8c2
```
I have tested this and it worked well for me. So no rewriting of rules is required, in my opinion, and there is no potential race condition waiting for the network interface names to be rewritten.

Thoughts?
I was going to try using MAC addresses as the interface name. Robert had asked for that to be tested, and it was on my list (right behind testing why, on Gen4, the disk wasn't resized - I didn't forget about that).
He was skeptical that would work, however.
So I guess this mechanism definitely works? But we'd need a new image to test with (because of the image description change)? I did get a test image to work with predictable names (like `enp209s0`, although this turns out to not be predictable at all). Would that image work, or is this change somewhat more involved?
I heard back from Cisco (I asked them why their predictable network device names weren't predictable, referring them to this post).
They responded:
> Thanks for the background. Just curious if you have tried the consistent device name (CDN) in the vnic template or directly in the service profile? Here is the documentation for it.
>
> You also need to enable CDN control in the BIOS policy before enabling CDN in the VNIC template/service profile. Reboot the server to take effect.
If you search for `CDN` on that page, you get to the relevant stuff. And sure enough, that page discusses the very thing that I'm encountering:
> When there is no mechanism for the Operating System to label Ethernet interfaces in a consistent manner, it becomes difficult to manage network connections with server configuration changes. Consistent Device Naming (CDN), introduced in Cisco UCS Manager Release 2.2(4), allows Ethernet interfaces to be named in a consistent manner. This makes Ethernet interface names more persistent when adapter or other configuration changes are made.
The default behavior for CDN on UCS is `disabled`, and that's what we're currently using. This can be changed, however. BUT: I noted on Cisco's link that SLES doesn't appear to be supported? The supported O/S list is:
It's interesting that RHEL is supported but SUSE is not, since they use common kernels (although the kernel configuration might be different).
CDN might fit the bill better because we wouldn't have to worry about doing automation to get the MAC addresses. Of course, CDN isn't an option on our VLI systems, where we'd presumably need to use predictable names there anyway. But I suspect that, on VLI, the names will actually be predictable. I can't be certain yet since I only have access to one Gen4 VLI system.
This terminology is a little numbing - regular network names, predictable names (based on MAC address or other factors), consistent names ...
So, some questions:
Why are CDN (Consistent Device Names) not supported by SLES? Will they work, or does this take engineering effort that RedHat did that SUSE did not?
Are CDN names a better option for us (since the names sound predictable from blade to blade, and move with the profile)? It sounds like they are, but only if they'll work for SLES ...
Thanks for your thoughts!
> He was skeptical that would work, however. So I guess this mechanism definitely works?
yes, it does
> But we'd need a new image to test with
Correct, the one we gave you uses predictable names, but based on PCI location, not based on MAC. As you nicely explained, based on PCI location it is not really predictable. So yes, a new image would be needed.
> Why are CDN (Consistent Device Names) not supported by SLES?
I can't answer this question. Either Robert knows why or I will be asking the right people next week when I'm in NUE
> Are CDN names a better option for us
It would be less work on your side. As you said, if we go with the MAC assignment, the "which MAC per interface" information is something you would need to provide in the yaml per instance. If you can trust the system with CDN to provide the same interface names for any instance, that would lower the work on your side. However, as we are in the enterprise business, I would not use features from the kernel that are flagged as not supported. I know many technologies work no matter what their official support status is, but we should not go that route imho.
I think we should stick with the predictable names based on MAC address; a test image is now available. That is straightforward and does not need any features on the system that have questionable support status.
As far as CDN is concerned I suspect that may simply be a documentation or a test issue. Since CDN is determined by BIOS setting this should apply equally to all Linux distributions. After all reading the firmware information is pretty much standard.
JFI: Had a conversation with Marius about CDN (BIOS device names) and he also recommended not using them. The reason here is simple: the names are presented by the BIOS to the system. This means it depends on the BIOS itself whether we get them, and it also depends on the BIOS whether they are correct. Any change on that level will run through the system and will cause harm to our implementation.
I have activated ID_NET_NAME_MAC in our devel image builds. It is done with an additional rule, 81-net-setup-link.rules, which comes directly after 80-net-setup-link and rewrites the interfaces to their MAC-based representation. I also tested the setup and adapted my integration test build.
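Presumably (an assumption based on the rule quoted earlier in this thread, not the actual file shipped in the image) that added rule contains something along these lines:

```
SUBSYSTEM=="net", ACTION=="add", NAME=="", ENV{ID_NET_NAME_MAC}!="", NAME="$env{ID_NET_NAME_MAC}"
```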
So from my perspective all coding work for this issue is done.
I booted what I believe is the latest test image. Networking was not up at all. Here's the output of `ifconfig -a`:
```
enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1A:00:15
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1A:00:16
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1A:00:17
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1A:00:18
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1B:00:0F
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

enx0025b5 Link encap:Ethernet  HWaddr 00:25:B5:1B:00:10
          BROADCAST MULTICAST  MTU:9000  Metric:1
          RX packets:0 errors:0 dropped:18 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:144 (144.0 b)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:10 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:660 (660.0 b)  TX bytes:660 (660.0 b)
```
I'm pretty sure I'm using the right image due to the `enx` devices in the above output.
I ended up grabbing the network configuration files in /etc/sysconfig/network, and those weren't what I expected at all. Here's an ifcfg.tar.gz with the contents of that directory. It was still configuring the `eth*` devices, which I didn't expect - I thought those files would be mentioning the MAC addresses.
In case it's relevant, here's the suse_firstboot_config.yaml file.
Let me know if you need additional information, thanks.
By the way: Because networking failed to come up, there was a deployment error in storage (couldn't mount storage devices).
I would have expected to see an error file from this in the config LUN, but did not:
```
Sollabdsm33:~ # ls /mnt/yaml3/
lost+found  rpms  scripts  ssh  suse_firstboot_config.yaml
Sollabdsm33:~ #
```
The console clearly showed a deployment error, however. What happened here? Why no logging details? The configuration disk was clearly mounted since other things were set (accounts, etc).
@jeffaco what did the YAML look like for the attempt with the failed network setup? Also, since the names of the interfaces are now long, please use the `ip a` command so we can see the full device names and not the names shortened by ifconfig.
I included the YAML in my original message with the results, I think you missed that.
Output from the `ip a` command:
```
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enx0025b51a0015: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1a:00:15 brd ff:ff:ff:ff:ff:ff
3: enx0025b51a0016: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1a:00:16 brd ff:ff:ff:ff:ff:ff
4: enx0025b51a0017: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1a:00:17 brd ff:ff:ff:ff:ff:ff
5: enx0025b51a0018: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1a:00:18 brd ff:ff:ff:ff:ff:ff
6: enx0025b51b000f: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1b:00:0f brd ff:ff:ff:ff:ff:ff
7: enx0025b51b0010: <BROADCAST,MULTICAST> mtu 9000 qdisc noop state DOWN group default qlen 1000
    link/ether 00:25:b5:1b:00:10 brd ff:ff:ff:ff:ff:ff
```
Let me know what else you need, thanks.
Thanks, OK those things match up, so we need to look at what wicked did. @jeffaco we'll need `ls /etc/sysconfig/network`, and then we'll want to know what wicked produced as far as messages are concerned. Use `journalctl -u` for each of these services:

```
wicked.service wickedd-auto4.service wickedd-dhcp4.service wickedd-dhcp6.service wickedd-nanny.service wickedd.service
```
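For convenience, one way to collect all of those in one go (a small sketch using the service names listed above):

```sh
for s in wicked.service wickedd-auto4.service wickedd-dhcp4.service \
         wickedd-dhcp6.service wickedd-nanny.service wickedd.service; do
    echo "===== $s ====="
    journalctl -u "$s"
done
```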
In the above message, I said:
> I ended up grabbing the network configuration files in /etc/sysconfig/network, and those weren't what I expected at all. Here's an ifcfg.tar.gz with the contents of that directory. It was still configuring the `eth*` devices, which I didn't expect - I thought those files would be mentioning the MAC addresses.
If I understand what you're after, couldn't you have gotten the results of the `ls` command from the ifcfg.tar.gz file? And the contents of the files, too, in case that was relevant? Or maybe I don't fully understand what you're after.
In any case, here's the output from each of the commands you asked for. I did the `ls` command with `-l` to ensure you could differentiate between directories and files.
Output from `ls -l /etc/sysconfig/network`:

```
total 92
-rw-r--r-- 1 root root  9692 Feb 16 12:15 config
-rw-r--r-- 1 root root 13520 Feb 16 12:16 dhcp
drwxr-xr-x 2 root root     6 Jun 27  2017 if-down.d
drwxr-xr-x 2 root root    27 Feb 16 12:15 if-up.d
-rw-r--r-- 1 root root    85 Feb 21 00:12 ifcfg-eth0
-rw-r--r-- 1 root root   146 Feb 21 00:12 ifcfg-eth0.250
-rw-r--r-- 1 root root    87 Feb 21 00:12 ifcfg-eth1
-rw-r--r-- 1 root root   148 Feb 21 00:12 ifcfg-eth1.251
-rw-r--r-- 1 root root    87 Feb 21 00:12 ifcfg-eth2
-rw-r--r-- 1 root root   148 Feb 21 00:12 ifcfg-eth2.252
-rw-r--r-- 1 root root    87 Feb 21 00:12 ifcfg-eth3
-rw-r--r-- 1 root root   148 Feb 21 00:12 ifcfg-eth3.253
-rw------- 1 root root   147 Dec  5 14:14 ifcfg-lo
-rw-r--r-- 1 root root 21738 Oct 14  2016 ifcfg.template
-rw-r--r-- 1 root root    29 Feb 21 00:12 ifroute-eth0.250
drwx------ 2 root root     6 Jun 27  2017 providers
drwxr-xr-x 2 root root    97 Feb 16 12:15 scripts
```
Output from `journalctl -u wicked.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:02 Sollabdsm31 systemd[1]: Starting wicked managed network interfaces...
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: lo              up
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth0            no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth0.250        no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth1            no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth1.251        no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth2            no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth2.252        no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth3            no-device
Feb 21 20:18:32 Sollabdsm31 wicked[4686]: eth3.253        no-device
Feb 21 20:18:32 Sollabdsm31 systemd[1]: Started wicked managed network interfaces.
```
Output from `journalctl -u wickedd-auto4.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Starting wicked AutoIPv4 supplicant service...
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Started wicked AutoIPv4 supplicant service.
```
Output from `journalctl -u wickedd-dhcp4.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Starting wicked DHCPv4 supplicant service...
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Started wicked DHCPv4 supplicant service.
```
Output from `journalctl -u wickedd-dhcp6.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Starting wicked DHCPv6 supplicant service...
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Started wicked DHCPv6 supplicant service.
```
Output from `journalctl -u wickedd-nanny.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:02 Sollabdsm31 systemd[1]: Starting wicked network nanny service...
Feb 21 20:18:02 Sollabdsm31 systemd[1]: Started wicked network nanny service.
```
Output from `journalctl -u wickedd.service`:

```
-- Logs begin at Thu 2019-02-21 20:17:04 UTC, end at Thu 2019-02-21 20:30:01 UTC. --
Feb 21 20:18:01 Sollabdsm31 systemd[1]: Starting wicked network management service daemon...
Feb 21 20:18:02 Sollabdsm31 systemd[1]: Started wicked network management service daemon.
```
Let me know if you need more information, thanks so much for your help!
That's why the network is not up: the files are named ifcfg-ethX, but the interfaces are named enx0025b51a0015 and so on. Not sure why the service generated the wrong interface file names. What should be in /etc/sysconfig/network is `ifcfg-enx0025b51a0015`, for example.
Yeah, I saw that and noted that in my quote above.
Any idea why that happened? A problem with my YAML (misunderstanding of what to put, perhaps?), or a problem with the code?
@jeffaco Sorry for making you do double work; yes, I missed half of what you said in the earlier comment :( I guess I was too distracted by the truncation from ifconfig.

Anyway, I looked at the code and it doesn't care what the name is. The code creates the names of the ifcfg- files based on the value of "interface". In theory things should match up.
Okay, I'll let you or Marcus take a closer look at the code to figure out why theory doesn't match up with reality.
I'm trying to bring up the interfaces (at least the client network) by (see the shell sketch after this list):

- renaming `ifcfg-eth0` to `ifcfg-enx0025b51a0015`
- renaming `ifcfg-eth0.250` to `ifcfg-enx0025b51a0015.250` and editing both `DEVICE` and `ETHERDEVICE` in that file, and
- renaming `ifroute-eth0.250` to `ifroute-enx0025b51a0015` and editing the file to contain: `default 10.60.0.1 - enx0025b51a0015.250`
After that, I restart wicked and the network doesn't come up. Makes it a pain in the tush as I'm tied to the console:
I didn't change the other interfaces (I thought I'd do that when I could connect via SSH). Any idea what I'm doing wrong?
After renaming the files, just run "ifup enx0025b51a0015" and "ifstatus enx0025b51a0015"
It's not coming up properly: `ifup enx0025b51a0015` yields:

```
wicked: Rejecting suspect interface name: enx0025b51a0015.250
enx0025b51a0015     up
```
But then the output of `ip a` is as above, and I don't have a network where I can ping the gateway. If I try to ping the gateway, I get `connect: Network is unreachable`. Any suggestions?
I've reached out to the networking team
Just saw your e-mail, let's wait for feedback from Marius. I haven't seen this error in my kvm based test which is interesting. The vlan id in my tests is '0' or '1'...
Some refactoring in our network code is required due to a size limit on network interface names. I will adapt the code according to the thread we had with Marius.
Hey, another issue came up with this test image:
This test image doesn't work with old YAML files. That is, if a YAML file specifies eth0 (for old behavior), it appears to still pick some sort of MAC address, and then doesn't come up for networking.
Old behavior should be consistent (with an old YAML file) to still work in Gen3. I'd like the new behavior (setting up interfaces like `enx0025b51a0015`) to only take place if I specify an interface like that in the YAML.
Is this possible?
> This test image doesn't work with old YAML files. That is, if a YAML file specifies eth0 (for old behavior), it appears to still pick some sort of MAC address, and then doesn't come up for networking. Old behavior should be consistent (with an old YAML file)
The yaml file is not the driver here. Whether you get eth0 vs. enx interface names is controlled by the kernel boot option:

```
net.ifnames=1
```

The image we (Robert) sent you has this option enabled to get you started with MAC-based interface names. If you want to go back, you need to pass:

```
net.ifnames=0
```

to the kernel when it boots. There is no way for us to control this at the yaml level.
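For reference, on a grub2-based SLES system such an option would typically be set roughly like this (a sketch assuming the standard grub2 layout; in these images it is baked into the image description instead):

```sh
# Edit /etc/default/grub and append net.ifnames=1 (or net.ifnames=0)
# to GRUB_CMDLINE_LINUX_DEFAULT, then regenerate the grub configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg
```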
I'm concerned that you are considering going back to non-predictable names. Actually, that would only be a safe choice if only one interface exists on the machine.

It would also be nice if we stay focused on the issue here, meaning dealing with multiple interfaces and assigning them correctly. I'm in the process of creating an image for you which includes fixes for all the reported issues, and I hope testing on your side will consistently use the MAC-based setup. I'd like to get this fixed and tested. Once all is good, we can talk about which systems you want predictable interface names for and which you don't.

Makes sense?
Makes sense, Marcus. The concern I have with predictable names is that I'm required to offer the MAC address. I think predictable names will likely work fine as is (without MAC addresses) on VLI systems, and regular network names (`eth0`, etc.) always worked fine in Gen3 (which we still support).
Thus, my thought was this:
| Platform | Network Name Conventions |
|---|---|
| Gen3 | Previous (working) config: `eth0` ... `ethx` |
| Gen4 (LI/UCS) | Predictable names with MAC addresses |
| Gen4 (VLI) | Predictable names without MAC addresses (`enp...`) |
Those are my thoughts. Any objections? If we do this, how can we specify the kernel boot parameters? Or will we have to use different images in these cases (I'd really prefer not to do that)?
Using predictable names on VLI based on location rather than MAC is not an issue.
Sure, but: how do we manage Gen3 platforms?
How can we specify the kernel boot parameters? Or will we have to use different images in Gen3 vs. Gen4 (I'd really prefer not to do that)?
Let's leave this issue opened until we completely understand how this will work in Gen3 (LI) and Gen4 (LI/VLI) given the above table ...
Gen3 LI will have to transition to the new MAC based scheme
Ooh, I'm not sure that's possible. Is that our only option? I'll need to take this to the team - we may end up not using this capability if that's the case.
Let me know if this is the only option, thanks. It would be super awesome if, somehow, the YAML could be used to determine the naming scheme ...
As @schaefi said, it is a kernel configuration option and has nothing to do with the YAML. What you are asking for is to build separate images for Gen3 and Gen4, i.e. double the image count for LI. Sorry, that's not an option.
> Gen3: Previous (working) config: eth0 ... ethx
You may have been on a lucky path here so far. There is no guarantee that the order of interfaces is persistent between boot cycles of the machine. As I said, ethX naming is imho only a safe option if there is only one interface available. As soon as there are more, you are on an unstable path with eth naming, as it is done by the order in which the devices appear to the kernel, and that order is not guaranteed.
> Gen3 LI will have to transition to the new MAC based scheme
From my point of view, that is the only stable solution.
In Gen3, we supply a YAML, boot the system, and bingo, the network is up.

In Gen4, this is not the case. After investigation, we figured out that this is tied to the file `/etc/udev/rules.d/70-persistent-net.rules`. For some reason, the pairings always seem to be correct in Gen3, but always seem to be incorrect in Gen4. I'm not quite sure why this is.

A "corrected" file is like this:

As per the instructions, we change the `NAME` field to be correct for the system, reboot, and we're good.

So, questions:

1. Once modified, is the only option to reboot? Is there some way to dynamically re-read this file?
2. How come the file is generated properly (100% of the time) on Gen3, but not on Gen4?
3. How can we modify the YAML to generate a correct file the first time so that the network is reachable after first boot?