daduke closed this issue 8 months ago
I only started encountering this in 6.8; 6.7 worked just fine for me. Even though a different version triggered it, the error message I got is otherwise identical to yours (barring the exact devices and journal entry involved).
I managed to track down the error to journal_io.c:journal_write_endio, with the "operation not supported" error code coming from the bio. The strangest part to me is that it appears for all three of my drives despite them being very different from each other (one NVMe, one SATA SSD, and one HDD). If it were a hardware capability problem, I'd expect at least one of those drives to be able to handle it. With that in mind, it's more likely something in the bio layer itself.
Edit: After some digging, it looks like a likely cause was already addressed in this commit. I'll try compiling a kernel from a version past it to see if that fixes the issue. If not, I guess we wait for 6.8-rc2.
Cool, I was about to post this issue, I didn't realize it had already cropped up. Adding my log:
Jan 27 22:25:55 mrgency kernel: bcachefs (sda1): error writing journal entry 1090234: operation not supported
Jan 27 22:25:55 mrgency kernel: bcachefs (sdb1): error writing journal entry 1090234: operation not supported
Jan 27 22:25:55 mrgency kernel: bcachefs (b546dee3-ba04-4def-b057-b12f7c9e2e82): unable to write journal to sufficient devices
Jan 27 22:25:55 mrgency kernel: bcachefs (b546dee3-ba04-4def-b057-b12f7c9e2e82): fatal error - emergency read only
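For anyone comparing logs across machines, here is a rough Python sketch that groups these journal write failures by journal entry number, so it is easy to see whether the same entry failed on every device at once. The regular expression is an assumption based only on the message format shown in the log above, and the function name failed_journal_writes is made up for illustration; it is not part of bcachefs-tools.

```python
import re
from collections import defaultdict

# Assumed message format, taken from the log lines quoted in this thread:
#   bcachefs (sda1): error writing journal entry 1090234: operation not supported
JOURNAL_ERR = re.compile(
    r"bcachefs \((?P<dev>[^)]+)\): error writing journal entry "
    r"(?P<seq>\d+): (?P<reason>.+)$"
)

# Sample input: the four kernel log lines posted above.
SAMPLE = [
    "Jan 27 22:25:55 mrgency kernel: bcachefs (sda1): error writing journal entry 1090234: operation not supported",
    "Jan 27 22:25:55 mrgency kernel: bcachefs (sdb1): error writing journal entry 1090234: operation not supported",
    "Jan 27 22:25:55 mrgency kernel: bcachefs (b546dee3-ba04-4def-b057-b12f7c9e2e82): unable to write journal to sufficient devices",
    "Jan 27 22:25:55 mrgency kernel: bcachefs (b546dee3-ba04-4def-b057-b12f7c9e2e82): fatal error - emergency read only",
]


def failed_journal_writes(lines):
    """Map each journal entry number to the (device, reason) pairs that failed to write it."""
    failures = defaultdict(list)
    for line in lines:
        match = JOURNAL_ERR.search(line)
        if match:
            failures[int(match.group("seq"))].append(
                (match.group("dev"), match.group("reason").strip())
            )
    return dict(failures)


if __name__ == "__main__":
    for seq, devices in failed_journal_writes(SAMPLE).items():
        print(f"journal entry {seq} failed on: {devices}")
```

With the log above, both sda1 and sdb1 show up under the same journal entry with the same reason, which fits the observation that the failure is not specific to one kind of drive.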
This has never happened for me on 6.7 so far, nor on 6.7 with patches from master applied before 6.8 was merged into it. It only started when I tried a mostly vanilla 6.8-rc1 kernel without any bcachefs-related patches or commits applied out of tree.
This should be fixed now in Linus's tree
Should be in 6.8-rc2, which was tagged about 35 minutes ago, but it will be a while before it hits the kernel.org main page. Does it need to be backported to 6.7 as well?
I can confirm that the issue has been resolved. Thanks!
hey there,
we've been playing around with bcachefs for over a year as a possible future candidate for our multi-PB storage setup. We regularly compile upstream kernels and test tiered (HDD hardware RAID + SSD cache) file system configurations. Starting right around the 6.7 release, we noticed that the first write to such a file system causes an error and the FS goes r/o:
this is on a Debian Bookworm system with upstream kernel and latest bcachefs-tools. I stripped down the mkfs and the minimal failing configuration is
while
works fine. I haven't found any similar bug description (neither GH issues nor mailing list), so it might well be something particular to our machine.
Any help would be greatly appreciated.
thanks, -Christian