Closed cmurf closed 1 week ago
Yes, this is known, it's a QEMU issue in the way the CD-ROM device emulation is done. We can't do much here, other than making the warning silent.
Probably unrelated, but for the record, there is a regression in Linux 5.6-rc1 still not fixed in Linus’ master (5.6-rc3). The patch fixes this.
Yo, hit this today upgrading to 6.6.44 kernel, running udisks2 2.10.1-4 on Arch.
udisksd[5518]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned: 0000: f0 00 01 00 50 00 02 0a 00 00 00 02 00 1d 00 00 ....P........... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0)
Possibly related to this, unverified: https://lore.kernel.org/stable/20240730151654.600327197@linuxfoundation.org/
Same error here since kernel upgrade to 6.10.2.arch1-2 on Arch.
@bsdice, @sushi2503, sounds like a Linux kernel regression. Please bisect.
@paulmenzel : For me to. Sorry I haven't the knowledge to bisect, hope bsdice has it...
I compiled 6.6.44 with the following three patches reverted (old dog making an educated guess) and that appears to have fixed the error:
ata-libata-scsi-do-not-overwrite-valid-sense-data-when-ck_cond-1.txt ata-libata-scsi-fix-offsets-for-the-fixed-format-sense-data.txt ata-libata-scsi-honor-the-d_sense-bit-for-ck_cond-1-and-no-error.txt
@paulmenzel Enough info to work with? I am not sure if there are interdependencies with the patches so I reverted all three.
Nice. But as you can see this is the UDisks’ project. It’d be great if you could report this to the commit authors and maintainers.
@paulmenzel Thank you so much for reply, I will just cc the author.
Can you say if this is UDisks mis-interpreting kernel's changed sense data, or if kernel is wrongly sending malformed sense data? Because the patch comments claim to have rectified previously wrong data: "Correct the ATA PASS-THROUGH fixed format sense data offsets to conform to SPC-6 and SAT-5 specifications."
Can you say if this is UDisks mis-interpreting kernel's changed sense data, or if kernel is wrongly sending malformed sense data?
Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.
[…]
Arch linux Kernel 6.10.3-arch1-2 udisks2 2.10.1-4, also receive this message
Aug 09 10:26:46 I-NET udisksd[1038]: Error probing device: Error sending ATA command IDENTIFY DEVICE to '/dev/sda': Unexpected sense data returned: 0000: f0 00 01 00 50 00 01 0a 00 00 00 00 00 1d 00 00 ....P........... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0) Aug 09 10:26:46 I-NET udisksd[1038]: Error probing device: Error sending ATA command IDENTIFY DEVICE to '/dev/sdb': Unexpected sense data returned: 0000: f0 00 01 00 50 00 00 0a 00 00 00 00 00 1d 00 00 ....P........... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0)
Is this a hardware error?
Most likely not a hardware error, but a kernel or UDisks regression. I reverted three libata patches and bug went away. I wrote to patch author and ata subsystem maintainer at kernel.org but no response so far. Might be due to vacation time on northern hemisphere. Possibility 1 is that the kernel patches are wrong, 2 that kernel now emits correct sense data which were previously slightly wrong or format garbled, but UDisks cannot process it now correctly. Don't have time to follow this rabbit hole down, maintainers have to do it.
@bsdice : Maybe it would be better to open a kernel bug report ?
I don't want to do it myself. From what I have seen you sub to -stable and reply to GregKH for the specific patch mail that there is a regression. Patches will be dropped if plausible. This puts pressure on patch writer or subsystem maintainers to clarify or fix it.
So you can probably rule out a kernel bug. I tested the lts-kernel, same problem...
Most importantly is that this get fixed.
So you can probably rule out a kernel bug.
Why? @bsdice wrote that reverting some patches fixed the issue for them.
I tested the lts-kernel, same problem...
Patches are backported to LTS series too. What Linux kernel do you mean exactly? The 6.10.3 from your earlier comment?
@istenburg No you can't rule that out, try a kernel from 3 months ago like 6.6.40. The stable kernel branches all got backported the three patches. That is why you see it now on Arch which is rolling linux package with latest kernel + patches and linux-lts package which is on 6.6 branch + even more stable patches.
Also broken: hddtemp, hdparm, probably more. This will be fun to watch, because the kernel is now acting properly, but applications are now misbehaving. UDisks will need a code update...
Edit: After thinking about it for a bit, best way forward probably is to revert the kernel changes, notify software maintainers of code using that API and give them 3 or 6 or 12 months to fix the use of the API. Then re-introduce the kernel change. hdparm will probably be fixed first, so people just have to copy or emulate the changes.
Yep, I have the same error with my external backup /dev/sda disk but began with my sr0.
Fun, as you say. Some codes are made with the elbows!
Let see the machinery...
After @bsdice has commented on the Arch Linux Forum thread I also found this issue and forwarded it to the LKML thread aswell 😊
This is the message: https://lore.kernel.org/all/e206181e-d9d7-421b-af14-2a70a7f83006@heusel.eu
Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.
I wouldn't necessarily play on the no-regression-policy note in this case, it's not the same scenario like breaking syscall API. I.e. the following statement from upstream
TL;DR: it is very hard to say that we have introduced a regression, because this crap has basically been broken in one way or another since it was introduced...
may be actually correct.
Well yes and no, especially since it breaks a ton of other tools aswell 😅
This is in a qemu-kvm guest.
qemu is known to have incomplete or incompatible implementation of ATA emulation, both for disk drives and optical drives.
The issue discussed recently is a separate one. Suggesting to continue in #1305 (not sure if github allows to split existing discussions).
Well yes and no, especially since it breaks a ton of other tools aswell 😅
Any reasonable CI would catch this. I agree that this kind of change is better to coordinate with userspace prior to pushing to stable kernel branches, but it happened in the past a couple of times.
UDisks needs to be fixed either way, this particular code has been sitting there for over a decade with no recent developer having actual hands-on experience with ATA commands.
Niklas came up with a patch to bridge kernel correctness and a little ignorant user space software: https://lore.kernel.org/linux-ide/20240812151517.1162241-2-cassel@kernel.org/
Nine days from report to tentative fix, underappreciated agility.
From the commit message in https://lore.kernel.org/linux-ide/20240812151517.1162241-2-cassel@kernel.org/, it seems something in udisks should be changed?
A lot of user space programs incorrectly assume that the sense data is in descriptor format, without checking the RESPONSE CODE field of the returned sense data (to see which format the sense data is in).
Sorry I haven’t looked closely, and do not know. (I am not a UDisk person.) I just know, that it wouldn’t matter as the Linux kernel has a no regression policy.
I wouldn't necessarily play on the no-regression-policy note in this case, it's not the same scenario like breaking syscall API. I.e. the following statement from upstream
TL;DR: it is very hard to say that we have introduced a regression, because this crap has basically been broken in one way or another since it was introduced...
may be actually correct.
That does not matter. Linux never breaks userspace, so people can be sure they can always update the Linux kernel without problems.
6.10.4, 6.10.5, still the same.
Yes, the reverts have just been added to the stable series :) https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.10/revert-ata-libata-scsi-honor-the-d_sense-bit-for-ck_cond-1-and-no-error.patch
Nice, soon fixed.
Since kernel 6.10.6 I don't see the errors anymore, sounds like fixed.
6.10.6 works for me, on 6.10.7 it's broken again. I had to boot back into 6.10.6 to mount the drive.
Here on 6.10.7 and 6.10.8 it's working fine.
So I have the affected kernel 6.10.4 here and can reproduce the problem. As a source of inspiration I took a look at udev's id_ata
and to my surprise found out the code was originally written by the same author. Only that udev has been maintained properly over the years, while nobody touched udisksata.c
since 2012.
Important fixes from udev are https://github.com/systemd/systemd/pull/13654 and https://github.com/systemd/systemd/pull/24923.
Anecdotally this concern more or less applies to udisks as well: https://github.com/systemd/systemd/issues/2362#issuecomment-178079214 and https://github.com/systemd/systemd/pull/2363#issuecomment-173274244
Posted #1311 with a quick fix, please test. We don't have variety of hardware to test this on.
Experimenting with my SATA drives and qemu ATA device, it turned out a SG v3 code path is taken. This udev change has effectively taken sense code checking out of the v3 path while udisks still maintains it and fails on qemu devices: https://github.com/systemd/systemd/commit/66026fca95c892059b7dc4f33596f63741519eb8
I don't think that change is right.
udisks2-2.8.4-4.fc32.x86_64 systemd-udev-245~rc1-2.fc32.x86_64 both: kernel-5.6.0-0.rc2.git0.1.fc32.x86_64 kernel-5.5.3-200.fc31.x86_64
[ 6.313699] vm.local udisksd[768]: Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned: 0000: 70 00 05 00 00 00 00 0a 00 58 00 01 21 04 00 00 p........X..!... 0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ (g-io-error-quark, 0)
This is in a qemu-kvm guest.