Open sharkcz opened 2 years ago
Hmm, now I understand it even less. Booting kernel 5.14.18 with rd.udev.debug, the journal is full of LINK messages for the by-id symlinks from the 59-dasd rules file, but nothing is there, not even the /dev/disk/by-id directory ...
I wonder if there is a race condition between creating the actual symlinks and creating the /dev/disk/by-id/ directory ...
I think messages like this explain the missing symlinks
...
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22.00000000000027200000000000000000', removing
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-0X5722', which is no longer belonging to this device.
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-0X5722', removing
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22', which is no longer belonging to this device.
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22', removing
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22.00000000000027200000000000000000', which is no longer belonging to this device.
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22.00000000000027200000000000000000', removing
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-0X5722', which is no longer belonging to this device.
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-0X5722', removing
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22', which is no longer belonging to this device.
Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-IBM.750000000FRB71.0233.22', removing
ping me for a full log
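To get an overview of a longer log, the removal messages can be tallied per device and symlink. The following is only a sketch: the regular expression is derived from the message format in the excerpt above, and `count_removals` is a hypothetical helper name, not part of udev or s390-tools.

```python
import re
from collections import Counter

# Pattern derived from the log excerpt above; other udev versions may word it differently.
REMOVED = re.compile(
    r"systemd-udevd\[\d+\]: (?P<dev>\S+): "
    r"No reference left for '(?P<link>[^']+)', removing"
)

def count_removals(lines):
    """Return a Counter mapping (device, symlink) to the number of removal messages."""
    counts = Counter()
    for line in lines:
        match = REMOVED.search(line)
        if match:
            counts[(match.group("dev"), match.group("link"))] += 1
    return counts

# Three lines copied from the excerpt above.
sample = [
    "Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: Updating old device symlink '/dev/disk/by-id/ccw-0X5722', which is no longer belonging to this device.",
    "Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-0X5722', removing",
    "Dec 10 10:48:30 openshift-8.s390.bos.redhat.com systemd-udevd[420]: dasdd: No reference left for '/dev/disk/by-id/ccw-0X5722', removing",
]
removals = count_removals(sample)
```

A link that is removed more than once for the same device within one boot would fit the suspicion that something keeps re-evaluating and dropping it.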
I'm currently unable to access the Red Hat BZ. I need to get an account first.
Can you share more details about your setup? How are the DASDs configured? Did you use chzdev -e to have a persistent configuration? Is it always the same DASDs that have this issue or is it random?
I've tried several reboots on an LPAR with 10 DASDs persistently configured using chzdev -e on a freshly installed F35 (tried both 5.15.6-200.fc35.s390x and 5.14.10-300.fc35.s390x + s390utils-2:2.17.0-2.fc35.s390x) but wasn't able to reproduce the issue so far.
my environment is
The original report is from OCP/RHEL-8.x with z/VM 7.2.0 on z13 and z15.
I suspect there might be something wrong with udev or kernel handling the devices, rather than the udev rules in s390utils which are pretty straightforward.
20220105-1028-udev.log.zip created with
journalctl -b | grep systemd-udev > 20220105-1028-udev.log
ll /dev/disk/by-id/ >> 20220105-1028-udev.log
lsdasd
I was able to reproduce the issue myself now with:
I didn't see the problem when using chzdev to persistently configure the devices. Maybe you can give it a try to see how this behaves on your setup. Make sure to remove all DASDs from /etc/dasd.conf and then enable them via chzdev -e <devices> (you can specify a range here as well, e.g. 9300-930f).
I'll dig a bit deeper to see where the problem might be.
Have you also removed the rd.dasd= definitions from the kernel parameter line and used the zdev "rootfs mode" purely?
Right now I am testing with /etc/dasd.conf completely removed (both from the system and from the initrd) and still no 100% success (4x all links created, 1x no links at all, 1x links for dasdc1 only).
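A tally like the one above is easier to collect with a small check run after each boot. This is a minimal sketch, assuming you already know which link names to expect; `missing_links` is a hypothetical helper, and the names passed to it have to come from your own system.

```python
import os

def missing_links(by_id_dir, expected):
    """Return the expected link names that are not present as symlinks in by_id_dir.

    If the directory itself does not exist, every expected name counts as missing.
    """
    if not os.path.isdir(by_id_dir):
        return sorted(expected)
    present = {
        name
        for name in os.listdir(by_id_dir)
        if os.path.islink(os.path.join(by_id_dir, name))
    }
    return sorted(set(expected) - present)
```

Calling `missing_links("/dev/disk/by-id", [...])` with the ccw-* names for the configured DASDs gives an empty list on a good boot, all names when the directory is absent, and a subset otherwise.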
I believe using the zdev persistent config doesn't matter. I have converted my system fully to zdev for DASDs and I am still getting random results with the "by-id" links. I suspect the problem is deeper in udev or the kernel.
@sharkcz it's been a while and I haven't been able to get around to looking deeper into this. Is this still reproducible?
hi @hoeppnerj, I believe it still does happen. I have tried a fresh F-40 installation on a z/VM guest (with a single DASD) and after the first boot there was no /dev/disk/by-id/ directory at all. I had thought the symlinks would be there in subsequent boot(s), because udev debugging says dasda: /usr/lib/udev/rules.d/59-dasd.rules:12 Added SYMLINK 'disk/by-id/ccw-0X0120' in the journal, but the symlink is still missing. Even more weird ...
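To compare what udev claims to have added with what actually exists, the "Added SYMLINK" debug messages can be collected per device. This is only a sketch: the pattern is taken from the single message quoted above and may not match other udev versions, and `expected_symlinks` is a hypothetical helper name.

```python
import re

# Pattern derived from the debug message quoted above.
ADDED = re.compile(
    r"(?P<dev>\S+): \S+\.rules:\d+ Added SYMLINK '(?P<link>[^']+)'"
)

def expected_symlinks(lines):
    """Map each device name to the set of /dev-relative symlinks udev reported adding."""
    result = {}
    for line in lines:
        match = ADDED.search(line)
        if match:
            result.setdefault(match.group("dev"), set()).add(match.group("link"))
    return result

# The message quoted above, used as sample input.
sample_line = "dasda: /usr/lib/udev/rules.d/59-dasd.rules:12 Added SYMLINK 'disk/by-id/ccw-0X0120'"
links = expected_symlinks([sample_line])
```

Each collected name is relative to /dev, so prefixing it with /dev/ gives the path that should exist after the boot has settled.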
I have checked other z/VM systems with multiple DASDs (F-39 with kernel 6.7) and they both have entries in /dev/disk/by-id/, but they are not complete if I see it right. That said, this is likely caused by a non-unique ID_UID returned by dasdinfo (ID_XUID is unique) ...
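Assuming a by-id link name is built from ID_UID, two devices reporting the same value would compete for a single link, which would explain an incomplete directory. The sketch below groups devices by an ID value to expose such collisions; `colliding_ids` is a hypothetical helper and the values are invented for illustration, not taken from a real system.

```python
from collections import defaultdict

def colliding_ids(ids_by_device):
    """Return {id_value: [devices]} for every ID value shared by more than one device."""
    groups = defaultdict(list)
    for device, value in ids_by_device.items():
        groups[value].append(device)
    return {value: sorted(devs) for value, devs in groups.items() if len(devs) > 1}

# Hypothetical ID_UID values, for illustration only.
uids = {
    "dasda": "IBM.EXAMPLE.0001",
    "dasdb": "IBM.EXAMPLE.0001",
    "dasdc": "IBM.EXAMPLE.0002",
}
collisions = colliding_ids(uids)
```

Feeding the same function the ID_XUID values should return an empty dictionary if those are indeed unique.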
And after another series of reboots on an F-39 system (with 1 DASD) I would say the by-id links are created reliably. So I suspect we might have a new F-40 (and likely RHEL-10) issue where the symlinks are not created at all. And perhaps the original issue went away at some point before F-39 ...
Alright, thanks for the update. I'll try to have a look again. It seems it must have something to do with 59-dasd.rules and dasdinfo. Maybe the tool isn't getting the information in time.
I would say please try F-39 on a slightly bigger VM than my single-DASD one to either confirm or refute my findings. Similarly with F-40. Right now I suspect a change in systemd/udev between v254 in F-39 and v255 in F-40. Being able to run the udev-worker process under strace could reveal something. I have even tried with SELinux disabled on the F-40 to rule out a too strict SELinux policy :-)
We are experiencing a situation where the /dev/disk/by-id/... symlinks are inconsistent across reboots. Sometimes links for all disks/dasds are present, sometimes only a (different) subset is present.
The environment is Fedora 35 with kernel-5.14.18-300.fc35.s390x and s390utils-core-2.17.0-2.fc35.s390x (the version shouldn't matter much, as etc/udev/rules.d/59-dasd.rules hasn't changed for a long time, except for the scheduler setting).
Fedora 35 with kernel-5.15.6-200.fc35.s390x doesn't seem to have the /dev/disk/by-id directory at all, looking further ...
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1963192