Closed ubuntu-server-builder closed 1 year ago
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T10:39:28.255787+00:00
Launchpad attachments: subiquity test - focal live 19032020_s390x_LPAR.txt
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T10:40:11.854229+00:00
Launchpad attachments: 20032020_DASD_LPAR.tgz
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T12:11:35.573646+00:00
Btw., this issue does not happen for me on a z/VM guest that is not attached to a VLAN environment. In a non-VLAN environment (here on z/VM) I see this file on the installed system, /etc/netplan/50-cloud-init.yaml, with proper content:
$ cat /etc/netplan/50-cloud-init.yaml
network:
  ethernets:
    enc600:
      addresses:
And the network is working after the post-install reboot.
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T13:04:09.113213+00:00
I just see that this 'could' be a duplicate of: LP 1861460 https://bugs.launchpad.net/bugs/1861460
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T14:37:44.769731+00:00
Some additional information:
Early in the subiquity installation process (right after disk device enablement) I can see two files in /etc/netplan/:
00-installer-config.yaml
50-cloud-init.yaml.dist-subiquity
I think both are not as they should be for this VLAN environment.
After replacing them with:
network:
  version: 2
  renderer: networkd
  ethernets:
    encc000:
      dhcp4: no
      dhcp6: no
  vlans:
    encc000.2653:
      id: 2653
      link: encc000
      addresses: [ 10.245.236.15/24 ]
      gateway4: 10.245.236.1
      nameservers:
        search: [ canonical.com ]
        addresses:
I was able to bring up the network (in the subiquity shell) using netplan apply. (I also disabled/enabled 0.0.c000 - but I think it was not needed).
Unfortunately there is still no network online after the post-install reboot:
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group defaul
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: encc000: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 16:9e:e9:36:c4:90 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::149e:e9ff:fe36:c490/64 scope link
       valid_lft forever preferred_lft forever
3: encc000.2653@encc000: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 16:9e:e9:36:c4:90 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::149e:e9ff:fe36:c490/64 scope link
       valid_lft forever preferred_lft forever
...since the following netplan yaml is in place - which is not correct:
$ cat /etc/netplan/50-cloud-init.yaml
network:
  ethernets:
    encc000: {}
  version: 2
  vlans:
    encc000.2653:
      id: 2653
      link: encc000
      nameservers:
        addresses:
Replacing it again with the above (known to work) yaml makes it possible to bring the network up again (with the help of netplan).
I am adding cloud-init as an affected package and will let the maintainers decide whether this is a duplicate or not (see previous comment).
Launchpad user Dan Watkins(oddbloke) wrote on 2020-03-23T16:56:44.005194+00:00
So it looks to me like the network config that cloud-init ends up with is:
{'ethernets': {'encc000': {'match': {'macaddress': 'b2:a0:38:23:63:93'}, 'nameservers': {'addresses': ['10.245.236.1']}, 'set-name': 'encc000.2653'}}, 'version': 2, 'vlans': {'encc000.2653': {'addresses': ['10.245.236.15/24'], 'gateway4': '10.245.236.1', 'id': 2653, 'link': 'encc000', 'nameservers': {'addresses': ['10.245.236.1'], 'search': ['canonical.com']}}}}
which looks incorrect because b2:a0:38:23:63:93 isn't the MAC address of any interface in the system AFAICT. (I also wonder if set-name'ing the encc000 ethernet to the name of the vlan would cause/is causing problems.)
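The check described in this comment can be sketched in a few lines of Python. This is only an illustration, not cloud-init code: the function name `unmatched_macs` is hypothetical, the config is the dict quoted above, and the list of system MAC addresses is an assumption taken from the `ip a` output earlier in this thread.

```python
from typing import Iterable, List


def unmatched_macs(config: dict, system_macs: Iterable[str]) -> List[str]:
    """Return the names of config entries whose match.macaddress
    is not the MAC address of any interface in the system."""
    present = {mac.lower() for mac in system_macs}
    missing = []
    for section in ("ethernets", "vlans"):
        for name, settings in (config.get(section) or {}).items():
            mac = ((settings or {}).get("match") or {}).get("macaddress")
            if mac and mac.lower() not in present:
                missing.append(name)
    return missing


# The network config quoted in the comment above.
CONFIG = {
    "ethernets": {
        "encc000": {
            "match": {"macaddress": "b2:a0:38:23:63:93"},
            "nameservers": {"addresses": ["10.245.236.1"]},
            "set-name": "encc000.2653",
        }
    },
    "version": 2,
    "vlans": {
        "encc000.2653": {
            "addresses": ["10.245.236.15/24"],
            "gateway4": "10.245.236.1",
            "id": 2653,
            "link": "encc000",
            "nameservers": {
                "addresses": ["10.245.236.1"],
                "search": ["canonical.com"],
            },
        }
    },
}

# Assumption: MAC addresses as seen in the `ip a` output earlier in this thread.
SYSTEM_MACS = ["16:9e:e9:36:c4:90"]

print(unmatched_macs(CONFIG, SYSTEM_MACS))
```

An entry reported by this check would never be matched by MAC address on the running system, which is the situation suspected in the comment.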
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-23T17:22:03.618691+00:00
Hi Dan, please note that MAC addresses may change on s390x systems - on such systems they are not as unique as you know them from other platforms...
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2020-03-25T02:23:07.899930+00:00
I admit to being quite confused, but I think this is probably in some sense a duplicate of the bug Frank linked. Frank, did you configure the networking by putting vlan=$whatever on the kernel command line, or do it entirely in subiquity?
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-25T07:37:09.043938+00:00
Yes, I used vlan= in the parmfile, hence passing it over as argument to the kernel. The entire parmfile (that holds all the kernel parameters) is this:
ip=10.245.236.15::10.245.236.1:255.255.255.0:s1lp15:encc000.2653:none:10.245.236.1 vlan=encc000.2653:encc000 url=ftp://installserver:21/ubuntu-live-server-20.04/focal-live-server-s390x.iso http_proxy=http://proxyserver:3128 --- quiet
(sorry, I should have already attached it)
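For readers less familiar with these parameters, the following Python sketch splits the `ip=` and `vlan=` arguments of the parmfile above into named fields. It is a rough illustration only: the field order for `ip=` follows the kernel's documented `ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>:<dns0-ip>` layout, the `vlan=` value is assumed to be `<vlan name>:<parent device>` as the example suggests, and the function name `parse_cmdline` is hypothetical. IPv6 addresses are not handled.

```python
from typing import Dict

# Field order of the kernel's ip= parameter (IPv4 only in this sketch).
IP_FIELDS = [
    "client_ip", "server_ip", "gateway", "netmask",
    "hostname", "device", "autoconf", "dns0",
]


def parse_cmdline(cmdline: str) -> Dict[str, Dict[str, str]]:
    """Extract the ip= and vlan= arguments from a kernel command line."""
    result: Dict[str, Dict[str, str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # plain flags such as "quiet" or the "---" separator
        if key == "ip":
            result["ip"] = dict(zip(IP_FIELDS, value.split(":")))
        elif key == "vlan":
            # Assumption: vlan=<vlan name>:<parent device>
            name, _, link = value.partition(":")
            result["vlan"] = {"name": name, "link": link}
    return result


# The parmfile content quoted above.
PARMFILE = (
    "ip=10.245.236.15::10.245.236.1:255.255.255.0:s1lp15:encc000.2653:none:10.245.236.1 "
    "vlan=encc000.2653:encc000 "
    "url=ftp://installserver:21/ubuntu-live-server-20.04/focal-live-server-s390x.iso "
    "http_proxy=http://proxyserver:3128 --- quiet"
)

parsed = parse_cmdline(PARMFILE)
print(parsed["ip"])
print(parsed["vlan"])
```

Read this way, the parmfile asks for a static address on the VLAN device encc000.2653, whose parent device is encc000, which is the same setup as the known-to-work netplan yaml from comment #5.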
I think that this is (still) needed to make sure that the installer has (in its early phase) a working network and is able to download the ISO image. This worked in the past for me, and it obviously still works - at least the ISO can be downloaded.
I'm wondering if this network config (that is obviously working, due to the successful ISO download) can just be accepted by subiquity and the network config screen populated with it?!
During the installation I can at times find two yaml files in /etc/netplan - one from the installer (I think written after I passed the network dialog screen) and one from cloud-init - that's pretty confusing.
And I am not able to use the subiquity UI's network configuration screen to create a network config (yaml) that is similar to the one in comment #5.
The one from comment #5 shows a configuration that is known to work and that one is comparable to the configuration done in the parmfile.
The cloud-init that I mentioned in comment #3 looks a bit odd to me - but I assume that netplan configs can just be done in different ways, still leading to the same result.
One concern is the use of: match: macaddress: 02:28:0b:00:00:53 I don't know why that is used (and needed?) - especially keeping in mind that MAC addresses are not necessarily unique on this platform (s390x). Sticking to the interface name (encc000 respectively encc000.2653 in case of VLAN) would be the preferable option.
Launchpad user Ryan Harper(raharper) wrote on 2020-03-25T17:28:52.690705+00:00
1) The cloud-init task here is a duplicate; I'd prefer to drop the task here but I'm not sure what to do bug-wise (can we mark only the task as duplicate?).
2) This comment
One concern is the use of: match: macaddress: 02:28:0b:00:00:53 I don't know why that is used (and needed?) - especially keeping in mind that MAC addresses are not necessarily unique on this platform (s390x).
s390x is unique in that MAC addresses are not stable. For the rest of the world, the MAC is the unique way of identifying which config is associated with a particular interface and, moreover, a way to ensure, independent of the interface name, that the config is applied from boot to boot.
Please file this issue as a separate cloud-init bug; and in there we can discuss alternatives as well as on which platforms MAC is unstable. ISTR that some s390x did provide stable MAC.
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-26T08:37:31.292245+00:00
I guess there is no option to mark just one (affecting) entry as duplicate, but I'm happy to mark the entire Bug as duplicate, since I found/remembered LP 1861460 pretty late - after I've already opened this one (see comment #4).
A ticket was opened for IBM to get MAC addresses stable/unique (across reboots) on s390x too, and indeed the firmware was modified, but only for z14 GA2 and newer systems - so there is still some legacy (z14 GA1 and older). Nowadays the interface names are based on their underlying physical device/address (here in this case 'c000'), which makes the interface and its name already pretty unique - since it is not possible to have two devices (in one system / LPAR) with the exact same address.
Btw. I think with the right tooling you can even change MAC addresses on other platforms, of course the intention was always to have MAC addresses stable and unique - but things changed.
I'll mark this now as duplicate and open a separate cloud-init ticket for further discussions ...
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-03-26T20:43:39.395593+00:00
1) cloud-init should be fixed for /run/netplan/* which is another bug 2) this bug is about subiquity not deleting/tearing down critical connections
Launchpad user Andrew Cloke(andrew-cloke) wrote on 2020-04-01T12:07:25.449709+00:00
Thanks Dimitri, is that cloud-init bug# 1861460 ?
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-04-01T16:20:56.914349+00:00
subiquity has a merge proposal to address this issue, I believe. Should be in edge channel soon.
Launchpad user Frank Heimes(fheimes) wrote on 2020-04-01T16:40:11.284601+00:00
That would be fantastic - please let me know if it's in - this will make further testing and usage on LPARs much simpler (and me very happy ;-)
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-04-01T17:24:19.359335+00:00
well, there are multiple pieces at stake here. I.e. has cloud-init completed successfully on the lpar boot? do you have cloud-init logs from the first boot of the system?
Launchpad user Frank Heimes(fheimes) wrote on 2020-04-01T17:36:41.839117+00:00
The cloud-init log can be found in the attached tgz in comment #2: https://bugs.launchpad.net/subiquity/+bug/1868246/+attachment/5339311/+files/20032020_DASD_LPAR.tgz (it incl. entire /var/log and /var/crash)
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-04-06T13:58:54.356484+00:00
This is awaiting pending changes in cloud-init, casper, initramfs-tools and livecd-rootfs all getting accepted and migrated to the release pocket, and a new image built using all of the above, before further tests/development can happen.
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-04-07T12:03:39.156748+00:00
When using today's ISO, with only a single disk drive attached, and snap refreshed from edge channel, one should be able to complete the install with networking not getting interrupted, and correctly have networking in the target too.
Launchpad user Frank Heimes(fheimes) wrote on 2020-04-07T19:29:08.756731+00:00
I just tried it with edge (updated subiquity manually, since I unfortunately need to set up a proxy first for my LPARs to be able to connect to the snap store, but that is due to the infrastructure).
The installation worked fine - I had network at the very first subiquity screen and I could just accept the network as it was at the network config screen (just had to select continue), used a single disk (as suggested) and was able to complete the installation - and hit Reboot at the end.
So everything was fine - except one little thing - and that is that the system came up w/o networking. At the console I found this netplan config:
ubuntu@zLin15:~$ ls -la /etc/netplan/
-rw-r--r-- 1 root root 90 Apr 10 19:17 /etc/netplan/00-installer-config.yaml
ubuntu@zlin15:~$ cat /etc/netplan/00
"# This is the network config written by 'subiquity'"
network:
  ethernets: {}
  version: 2
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2020-04-08T04:33:01.067699+00:00
Oh well one step at a time. Can you extract the /var/log/installer/subiquity-debug.log file from the installed system and attach it to this bug?
I've promoted current edge to the stable/ubuntu-20.04 channel and so testing tomorrow's ISO should be easier.
Launchpad user Frank Heimes(fheimes) wrote on 2020-04-08T05:58:53.043350+00:00
I left the system in that state yesterday, now just fixed nw and saved /var/log/installer. Please see attached tgz.
Launchpad attachments: inst.tgz
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2020-04-08T08:04:03.624673+00:00
So the issue here is that initramfs-tools is generating this config:
https://paste.ubuntu.com/p/Cww2XkWB8J/
but this is invalid: the value of a key in ethernets has to have some value. What is happening is that cloud-init is failing to parse it and so nothing at all gets written to any netplan directory.
I think this little patch https://paste.ubuntu.com/p/CzGww7htCQ/ to initramfs-tools should fix the issue, but I'd like to test before uploading.
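The failure mode described here, a key under ethernets that carries no value, can be illustrated with a small Python check. This is a hedged sketch and not the actual cloud-init or netplan validation: the config below is hypothetical (the pasted config is not reproduced in this thread), the function name `empty_ethernets` is made up, and it assumes that "no value" means the key parses to a null value rather than to an empty mapping.

```python
from typing import List


def empty_ethernets(network: dict) -> List[str]:
    """Return the names of ethernets entries that have no value at all.

    An empty mapping ({}) counts as a value; only None (a bare
    "name:" line in YAML) is reported.
    """
    ethernets = network.get("ethernets") or {}
    return [name for name, value in ethernets.items() if value is None]


# Hypothetical config: "encc000:" written with nothing after the colon.
BROKEN = {"version": 2, "ethernets": {"encc000": None}}

# Same interface with an empty mapping, i.e. "encc000: {}".
WITH_MAPPING = {"version": 2, "ethernets": {"encc000": {}}}

print(empty_ethernets(BROKEN))
print(empty_ethernets(WITH_MAPPING))
```

Under these assumptions, only the first config would be reported as having an entry without a value.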
Launchpad user Dimitri John Ledkov(xnox) wrote on 2020-04-08T20:35:28.738187+00:00
and one more typo
Launchpad user Michael Hudson-Doyle(mwhudson) wrote on 2020-04-13T20:35:08.672382+00:00
So we believe this bug is now fixed but can you confirm Frank?
Launchpad user Frank Heimes(fheimes) wrote on 2020-04-14T13:41:55.085536+00:00
Yes, I can confirm that this is fixed now (using image from Apr 14th) - many thx!
This bug was originally filed in Launchpad as LP: #1868246
Launchpad details
Launchpad user Frank Heimes(fheimes) wrote on 2020-03-20T10:39:28.255787+00:00
Today I tried a subiquity LPAR installation using the latest ISO (March 19) that includes the latest 20.03 subiquity. The installation itself completed fine, but after the post-install reboot the system didn't have an active network - please note that the LPAR is connected to a VLAN.
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: encc000: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether a2:8d:91:85:12:e3 brd ff:ff:ff:ff:ff:ff
3: enP1p0s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:0c:2d:0c:b8:70 brd ff:ff:ff:ff:ff:ff
4: enP1p0s0d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:0c:2d:0c:b8:71 brd ff:ff:ff:ff:ff:ff
5: enP2p0s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:0c:2d:0c:b7:00 brd ff:ff:ff:ff:ff:ff
6: enP2p0s0d1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 82:0c:2d:0c:b7:01 brd ff:ff:ff:ff:ff:ff
When I wanted to have a look at the netplan config, it turned out that there is no yaml file:
$ ls -l /etc/netplan/
total 0
Adding one manually and applying it worked fine.
So it looks like the installer does not properly generate or copy a 01-netcfg.yaml to /etc/netplan.
Please see below the entire steps as well as a compressed file with the entire content of /var/log