Open yshui opened 2 years ago
Hmm, there is another problem. I worked around this issue (by disabling ZSWAP), but the resulting kernel then does not boot on real hardware, although it does boot in QEMU.
Chiming in as I've also been able to reproduce this exact issue earlier, even with an LLVM toolchain built from Git sources that are only a business week old (some of those TCs from that time had other problems like miscompiling sha512_ssse3, but not really relevant here, & also already fixed in newer revisions).
This seems to no longer be reproducible. I've built a kernel with CONFIG_ZFS=y, CONFIG_JUMP_LABEL=y, & CONFIG_ZSWAP=y (let me know if you would like the exact dotconfig) with fairly fresh LLVM & Clang main, specifically bfc812a2f32698ef383d486c25fa6abc001d6466, with both full & thin LTO (I've been able to reproduce the same issue with full LTO as well before) & the kernel boots just fine on QEMU, Cloud Hypervisor, & several pieces of real HW (a smorgasbord of x86_64 stuff).
I'm not really sure which commit fixed it, but the commit I used is no longer affected by this bug, at least from my (admittedly somewhat minimal) testing.
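For anyone comparing their own dotconfig against the combination described above, here is a minimal Python sketch. The helper names (`parse_kconfig`, `matches_reported_setup`) are hypothetical and not part of any kernel tooling. It assumes the LTO mode shows up as `CONFIG_LTO_CLANG_FULL` or `CONFIG_LTO_CLANG_THIN`, and it only tells you whether a config resembles the reported setup, not whether the bug will actually trigger.

```python
# Hypothetical helpers for illustration; not part of the kernel tree.

def parse_kconfig(text):
    """Return {option: value} for every 'CONFIG_X=value' line in a .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_X is not set").
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_"):
            options[name] = value
    return options


# Options named in this thread as being built in when the issue appeared.
SUSPECT = ("CONFIG_ZFS", "CONFIG_JUMP_LABEL", "CONFIG_ZSWAP")
# Assumption: Clang LTO is selected through one of these two symbols.
LTO_MODES = ("CONFIG_LTO_CLANG_FULL", "CONFIG_LTO_CLANG_THIN")


def matches_reported_setup(options):
    """True if all suspect options are =y and some Clang LTO mode is on."""
    builtin = all(options.get(name) == "y" for name in SUSPECT)
    lto = any(options.get(name) == "y" for name in LTO_MODES)
    return builtin and lto


sample = """\
CONFIG_ZFS=y
CONFIG_JUMP_LABEL=y
CONFIG_ZSWAP=y
CONFIG_LTO_CLANG_THIN=y
# CONFIG_LTO_CLANG_FULL is not set
"""
print(matches_reported_setup(parse_kconfig(sample)))
```

To check a real build, read the `.config` from the kernel build directory and pass its contents to `parse_kconfig`.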
Versions of other things:
c3eb11fbb826879be773c137f281569efce67aa8
b0657a59abb38659721bf8d973920292c4f4a1a8
@0n-s thanks, i will try to repro again later.
Hi. I seem to be hitting this same issue, or an extremely similar one. My setup is the same except that, instead of ZFS, I am using another out-of-tree CoW filesystem, Bcachefs (testing branch).
It is built-in as well, with ZSWAP=y and JUMP_LABEL=y, on 6.6 git master.
I will try using Clang+LLVM 18 main to see if it's still reproducible there.
The main problem for me is that it does not produce any kernel output on boot, but compiling with CFI and LTO off reveals very similar kernel errors to the ones listed above, especially on non-clean unmounts.
FYI, there are parts of ZFS and Bcachefs that overlap, to the extent of having this many issues with the keyword "bcachefs" (wild guess: might this have something to do with kmem_cache_alloc?) (https://github.com/openzfs/zfs/pull/15143)
Kernel commit: b9ddbb0cde2adcedda26045cc58f31316a492215
LLVM/Clang version: 17.0.2 stable
Bcachefs commit: e1aae900a671cad3ed51c252a0dda0c7e8a89362
OS: Chimera Linux rolling
I reported this in #1440, but after further investigation this looks like a different issue.
The symptom is the same as #1440:
However, this only happens if I enable the ZFS module as a kernel builtin. (Of course, disabling CONFIG_ZSWAP helps, since it works around that jump label.)
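As a rough illustration of the CONFIG_ZSWAP workaround, the sketch below switches a single option off in dotconfig text by writing the usual `# CONFIG_X is not set` form. The function name `disable_option` is hypothetical. This only edits text, so after a change like this one would normally still let Kconfig re-resolve dependencies (for example with `make olddefconfig`) before building.

```python
# Hypothetical helper for illustration; it edits text only and does not
# resolve Kconfig dependencies.

def disable_option(config_text, name):
    """Return config_text with `name` switched off, Kconfig-style."""
    off_line = f"# {name} is not set"
    out = []
    found = False
    for line in config_text.splitlines():
        stripped = line.strip()
        # Match "CONFIG_ZSWAP=..." exactly, not "CONFIG_ZSWAP_DEFAULT_ON=...".
        if stripped.startswith(name + "=") or stripped == off_line:
            out.append(off_line)
            found = True
        else:
            out.append(line)
    if not found:
        # Option was absent; record it explicitly as disabled.
        out.append(off_line)
    return "\n".join(out) + "\n"


example = "CONFIG_JUMP_LABEL=y\nCONFIG_ZSWAP=y\n"
print(disable_option(example, "CONFIG_ZSWAP"), end="")
```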
Interesting how ZFS can trigger a codegen change in seemingly completely unrelated code through LTO.
This is probably related to openzfs/zfs#13549