Closed mabod closed 4 years ago
@mabod: pretty sure I've seen this, or a very similar stack trace, a few times on BFQ, without the bfq-sq changes now upstream (stuck on 4.9). Have you tried #6568 by chance? Curious if it crops up there too.
I did not test #6568. But I could test it if (1) somebody provides me with zfs/spl module binaries for Manjaro and (2) this somebody promises me that this will not break my pool and the data.
You should REALLY not be installing binaries into your kernel from random sources :-). One of those infosec "no-nos." Arch (or your version of it) has a very simple package builder you can utilize to quickly rebuild ZFS for your kernel and patch it on the fly in the build process. PKGBUILDs are well doc'd. Also, your ZFS data may already be corrupt for all you know, and nobody will guarantee you anything - there's no financial exchange going on here to imply any level of support or liability (see the old spacemap issues, or the recent crypto fixes, for examples of how corruption can happen without you knowing).
I understand that. I was not 100% serious with my comment. I do not feel comfortable compiling zfs+patch by myself and testing it. I strongly hope that this issue can be debugged and solved by other means.
This issue is obsolete. Closing it.
System information
Distribution Name | Manjaro
Distribution Version | rolling
Linux Kernel | 4.13.11 (not happening with 4.9.60)
Architecture | amd64
ZFS Version | 0.7.2-1
SPL Version | 0.7.2-1
Describe the problem you're observing
With kernel 4.13.11 and zfs_prefetch_disable=0, the zfs module crashes, with a core dump logged in the journal, when the scheduler is switched to bfq-sq.
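For anyone trying to confirm that a system is in the same state, both settings can be read from sysfs. This is a minimal sketch in Python, assuming the usual locations `/sys/block/<dev>/queue/scheduler` and `/sys/module/zfs/parameters/zfs_prefetch_disable`. The device name `sda` and the function names are placeholders, not taken from this report.

```python
from pathlib import Path


def active_scheduler(sched_line: str) -> str:
    """Return the scheduler marked with brackets in a sysfs scheduler line.

    The kernel lists all schedulers and wraps the active one in brackets,
    for example "noop deadline cfq [bfq-sq]".
    """
    for token in sched_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler marked in {sched_line!r}")


def read_state(device: str = "sda", sysfs: Path = Path("/sys")) -> dict:
    """Read the active I/O scheduler and the ZFS prefetch setting.

    `device` is a placeholder; use the disk that backs the pool.
    """
    sched = (sysfs / "block" / device / "queue" / "scheduler").read_text()
    prefetch = (
        sysfs / "module" / "zfs" / "parameters" / "zfs_prefetch_disable"
    ).read_text()
    return {
        "scheduler": active_scheduler(sched),
        "zfs_prefetch_disable": int(prefetch),
    }
```

Calling `read_state("sda")` on a machine with the zfs module loaded returns a small dict that can be pasted into a bug report alongside the kernel version.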
Describe how to reproduce the problem
I am doing performance tests for different schedulers. I am using the following script to switch the schedulers and execute fio for the benchmark:
When the script reaches scheduler bfq-sq the fio process hangs. It sits for about a minute before the following messages show up in the journal:
This only happens with kernel 4.13.11 and zfs_prefetch_disable=0. This is reproducible when the script switches to scheduler bfq-sq. No issue with the other schedulers.
With kernel 4.9.60 it does not happen regardless of zfs_prefetch_disable. By the way, for kernel 4.9.60 I have to use bfq instead of bfq-sq.
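For readers who want to attempt a similar reproduction, the loop described above (switch the scheduler, then run fio) could look roughly like the following. This is a sketch only and not the reporter's script: the device name, the test file path, the fio parameters, and all function names are assumptions chosen for illustration. Writing to the scheduler file requires root.

```python
import subprocess
from pathlib import Path
from typing import List


def available_schedulers(sched_line: str) -> List[str]:
    """List every scheduler named in a sysfs scheduler line.

    The active entry is wrapped in brackets, which are stripped here.
    """
    return [token.strip("[]") for token in sched_line.split()]


def fio_command(job_name: str, filename: str) -> List[str]:
    """Build an fio invocation; the workload parameters are assumed values."""
    return [
        "fio",
        f"--name={job_name}",
        f"--filename={filename}",
        "--rw=randread",
        "--bs=4k",
        "--size=1G",
        "--runtime=60",
        "--time_based",
        "--output-format=json",
    ]


def run_benchmarks(device: str, filename: str, timeout_s: int = 300) -> None:
    """Switch to each available scheduler in turn and run fio under it."""
    sched_file = Path("/sys/block") / device / "queue" / "scheduler"
    for name in available_schedulers(sched_file.read_text()):
        sched_file.write_text(name)  # needs root
        try:
            subprocess.run(fio_command(name, filename), check=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # A hang like the one reported would surface here.
            print(f"{name}: fio did not finish within {timeout_s}s")
```

Run as root, a call such as `run_benchmarks("sda", "/tank/fio.test")` would iterate over whatever schedulers the kernel offers for that disk. If fio is stuck in uninterruptible I/O, the timeout may not be able to reap the process, so the journal is still the place to look for the stack trace.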