Open rincebrain opened 3 years ago
I recently read through the source code and found that tx->tx_open_txg is often accessed without sufficient locking (see txg_wait_synced_impl()). That is fine on a 64-bit system, but on a 32-bit system you may read a partially updated value, which may cause the deadlock you experienced.
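To illustrate the kind of partial update described above, here is a minimal, deterministic sketch. It is not ZFS code: `split_u64_t`, `split_store()` and `simulate_torn_read()` are hypothetical names, and the interleaving is simulated in a single thread instead of being produced by a real race. It assumes a 32-bit CPU reads a 64-bit field as two separate 32-bit loads, low word first; the actual order depends on the compiler and architecture.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit counter such as tx->tx_open_txg,
 * stored as the two 32-bit words a 32-bit CPU would load separately.
 */
typedef struct split_u64 {
	uint32_t lo;
	uint32_t hi;
} split_u64_t;

/* Store a 64-bit value as two 32-bit halves. */
static void
split_store(split_u64_t *s, uint64_t v)
{
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
}

/*
 * Simulate an unlocked reader that loads the low word, is interrupted by
 * a writer storing new_val, and then loads the high word. The result can
 * be a value that was never actually stored.
 */
uint64_t
simulate_torn_read(uint64_t old_val, uint64_t new_val)
{
	split_u64_t s;
	uint32_t lo, hi;

	split_store(&s, old_val);
	lo = s.lo;			/* first load sees the old value */
	split_store(&s, new_val);	/* writer runs between the loads */
	hi = s.hi;			/* second load sees the new value */

	return (((uint64_t)hi << 32) | lo);
}
```

With an old value of 0x00000000FFFFFFFF and a new value of 0x0000000100000000, the simulated reader observes 0x00000001FFFFFFFF, which matches neither. Holding the same lock around both the read and the write, or using a 64-bit atomic load, would rule this interleaving out.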
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
I'm having the exact same issue, especially when trying to compile Linux to bootstrap a Gentoo system. Unless I do some seemingly unrelated activity like running ls or find, kernel compilation halts seemingly indefinitely (at least 5-6 h), but as soon as I mess around on the filesystem from a different tty it continues as normal. Tested against 2.2.3 and 2.2.4 with the same results. I use a tiny Intel Atom Sony Vaio VGN-P25G.
System information
Describe the problem you're observing
Somewhat randomly, during ZTS runs, all ZFS IO will grind to a halt - iostat reports nothing going on, CPU is over 99% idle, but some ZFS or ZFS-accessing task is in iowait. Usually they eventually continue, but sometimes I've let them wait 30-60 minutes without any progress.
mkfile seems especially good at triggering this.
Describe how to reproduce the problem
Run the "sanity" runfile for ZTS on i386 (probably with the test cases mentioned in #12029 removed, because nobody likes a BUG_ON) - I've yet to get one to run to completion without this happening.
Include any warning/errors/backtraces from the system logs
Sometimes there will be one or more log messages like the following:
Running echo 'w' | sudo tee /proc/sysrq-trigger while this was happening produced: