Closed: prakashsurya closed this issue 6 years ago.
Last week I suspected this issue might have been due to zfsonlinux/zfs@d4a72f23863382bdf6d0ae33196f5b5decbc48fd (I no longer suspect this), so I also kicked off a long-running zloop process on another system with that commit reverted (but still based on zfsonlinux/zfs@94183a9d8a1133ff0d29666a86f84c24f2c4083c). Even with the sequential scrub code reverted, I was still able to trigger a hang.
$ date
Mon Nov 27 22:57:57 UTC 2017
$ pgrep ztest | xargs ps -f -p
UID PID PPID C STIME TTY STAT TIME CMD
root 8516 27305 0 Nov23 pts/1 Sl+ 2:40 /export/home/delphix/zfs/cmd/ztest/.libs/ztest
root 27305 2063 0 Nov23 pts/1 S+ 0:03 /export/home/delphix/zfs/cmd/ztest/.libs/ztest -VVVVV -m 2 -r 0 -R 1 -v 2 -a 9 -T 92 -P 39 -s 512m -f /var/tmp/zloop/zloop-run
$ pstree -p 27305 | head
ztest(27305)---ztest(8516)-+-{ztest}(8517)
|-{ztest}(8518)
|-{ztest}(8519)
|-{ztest}(8520)
|-{ztest}(8521)
|-{ztest}(8522)
|-{ztest}(8523)
|-{ztest}(8524)
|-{ztest}(8525)
|-{ztest}(8526)
Again, I used gcore and gdb to dump the stacks of all threads. Here's the corefile (compressed with zip), core.8516.zip, and here's a file containing all stacks, gdb.txt.
In this example, there's also a single thread in zio_wait, only this time it's the sync thread:
Thread 565 (Thread 0x7f4ea06d5700 (LWP 9525)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f4eaca99e34 in cv_wait (cv=cv@entry=0x7f4df004a3f0, mp=mp@entry=0x7f4df004a3c0) at kernel.c:328
#2 0x00007f4eacbd523b in zio_wait (zio=zio@entry=0x7f4df004a010) at ../../module/zfs/zio.c:1889
#3 0x00007f4eacb0a36c in dsl_pool_sync (dp=dp@entry=0x5617ae500b30, txg=txg@entry=368) at ../../module/zfs/dsl_pool.c:558
#4 0x00007f4eacb335e6 in spa_sync (spa=spa@entry=0x5617ae4fad70, txg=txg@entry=368) at ../../module/zfs/spa.c:6868
#5 0x00007f4eacb454e7 in txg_sync_thread (arg=0x5617ae500b30) at ../../module/zfs/txg.c:547
#6 0x00007f4eac5386da in start_thread (arg=0x7f4ea06d5700) at pthread_create.c:456
#7 0x00007f4eac272d7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
With that said, the stacks in this corefile look more interesting than the other one's (e.g. one thread in spa_open_common, another in spa_keystore_dsl_key_rele, another in arc_read via dsl_dir_get_encryption_root_ddobj, etc.), so these could be two separate issues, both resulting in a ztest "hang". For now, I'm lumping both into the same issue; we can split them up later if they're determined to be different.
Lastly, I also tried running zloop against my code for #6566 (also based on zfsonlinux/zfs@94183a9d8a1133ff0d29666a86f84c24f2c4083c) and also hit a ztest hang:
$ date
Mon Nov 27 23:24:24 UTC 2017
$ pgrep ztest | xargs ps -f -p
UID PID PPID C STIME TTY STAT TIME CMD
root 14427 26148 0 Nov24 pts/1 S+ 0:01 /export/home/delphix/zfs/cmd/ztest/.libs/ztest -VVVVV -m 2 -r 0 -R 1 -v 2 -a 12 -T 66 -P 19 -s 512m -f /var/tmp/zloop/zloop-run
root 22145 14427 0 Nov24 pts/1 Sl+ 2:23 /export/home/delphix/zfs/cmd/ztest/.libs/ztest
$ pstree -p 14427 | head
ztest(14427)---ztest(22145)-+-{ztest}(22146)
|-{ztest}(22147)
|-{ztest}(22148)
|-{ztest}(22149)
|-{ztest}(22150)
|-{ztest}(22151)
|-{ztest}(22152)
|-{ztest}(22153)
|-{ztest}(22154)
|-{ztest}(22155)
In this case, there are a lot of threads waiting in spa_open_common (probably waiting for spa_namespace_lock), and the sync thread has the following stack:
Thread 565 (Thread 0x7f30350e1700 (LWP 23159)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f304162c724 in cv_wait (cv=cv@entry=0x55aba21ef6d0, mp=mp@entry=0x55aba21ef690) at kernel.c:328
#2 0x00007f30416a286b in scan_exec_io (dp=dp@entry=0x55aba225f1c0, bp=bp@entry=0x7f30350ddbb0, zio_flags=25166000, zio_flags@entry=32560, zb=0x7f30350ddb90, zb@entry=0x18000b0350ddb90, queue=queue@entry=0x0)
at ../../module/zfs/dsl_scan.c:3566
#3 0x00007f30416a4dfd in dsl_scan_enqueue (dp=dp@entry=0x55aba225f1c0, bp=bp@entry=0x7f30350ddbb0, zio_flags=32560, zio_flags@entry=25166000, zb=0x18000b0350ddb90, zb@entry=0x7f30350ddb90) at ../../module/zfs/dsl_scan.c:3436
#4 0x00007f30416a50f2 in dsl_scan_scrub_cb (dp=0x55aba225f1c0, bp=0x7f30350ddbb0, zb=0x7f30350ddb90) at ../../module/zfs/dsl_scan.c:3509
#5 0x00007f30416a82e1 in dsl_scan_ddt_entry (scn=0x55aba22315c0, checksum=<optimized out>, dde=dde@entry=0x7f2f4400fe10, tx=tx@entry=0x9f82231a8ac43995) at ../../module/zfs/dsl_scan.c:2299
#6 0x00007f304165bada in ddt_sync_entry (ddt=ddt@entry=0x55aba22e9710, dde=dde@entry=0x7f2f4400fe10, tx=0x9f82231a8ac43995, tx@entry=0x7f3030010010, txg=15078613438550168504, txg@entry=190) at ../../module/zfs/ddt.c:1122
#7 0x00007f304165bd19 in ddt_sync_table (ddt=ddt@entry=0x55aba22e9710, tx=tx@entry=0x7f3030010010, txg=txg@entry=190) at ../../module/zfs/ddt.c:1147
#8 0x00007f304165ca16 in ddt_sync (spa=spa@entry=0x55aba21eed20, txg=txg@entry=190) at ../../module/zfs/ddt.c:1199
#9 0x00007f30416ca307 in spa_sync (spa=spa@entry=0x55aba21eed20, txg=txg@entry=190) at ../../module/zfs/spa.c:6882
#10 0x00007f30416dc1f7 in txg_sync_thread (arg=0x55aba225f1c0) at ../../module/zfs/txg.c:547
#11 0x00007f30410ca6da in start_thread (arg=0x7f30350e1700) at pthread_create.c:456
#12 0x00007f3040e04d7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
Again, I used gcore and gdb to dump the stacks of all threads. Here's the corefile (compressed with zip), core.22145.zip, and here's a file containing all stacks, gdb.txt.
I took a look at the gdb.txt files. The one from Mon Nov 27 22:57:57 UTC 2017 was particularly interesting to me because it was stuck in the encryption code. Notably, this one doesn't seem to be caused by the same problem as the others; it appears to be the result of this stack trace:
Thread 588 (Thread 0x7f4e863d0700 (LWP 10220)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f4eac53af0d in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f4eacf31650 <buf_hash_table+166000>) at ../nptl/pthread_mutex_lock.c:80
#2 0x00007f4eaca9991c in mutex_enter (mp=mp@entry=0x7f4eacf31650 <buf_hash_table+166000>) at kernel.c:207
#3 0x00007f4eacaa6b16 in buf_hash_find (spa=spa@entry=1711801064303451485, bp=bp@entry=0x7f4dd0003240, lockp=lockp@entry=0x7f4e863cc548) at ../../module/zfs/arc.c:1086
#4 0x00007f4eacab476a in arc_read (pio=pio@entry=0x7f4e7c00a610, spa=0x5617ae4fad70, bp=0x7f4dd0003240, done=done@entry=0x7f4eacac0750 <dbuf_read_done>, private=private@entry=0x7f4e7c0029b0, priority=priority@entry=ZIO_PRIORITY_SYNC_READ,
zio_flags=128, arc_flags=0x7f4e863cc5dc, zb=0x7f4e863cc5e0) at ../../module/zfs/arc.c:5881
#5 0x00007f4eacabf376 in dbuf_read_impl (db=db@entry=0x7f4e7c0029b0, zio=zio@entry=0x7f4e7c00a610, flags=flags@entry=10) at ../../module/zfs/dbuf.c:1134
#6 0x00007f4eacabfa73 in dbuf_read (db=db@entry=0x7f4e7c0029b0, zio=0x7f4e7c00a610, zio@entry=0x0, flags=flags@entry=10) at ../../module/zfs/dbuf.c:1276
#7 0x00007f4eacacbd40 in dmu_buf_hold (os=<optimized out>, object=<optimized out>, offset=offset@entry=0, tag=tag@entry=0x7f4eacc4c110 <__func__.14426>, dbp=dbp@entry=0x7f4e863cc6f0, flags=flags@entry=1) at ../../module/zfs/dmu.c:233
#8 0x00007f4eacbc1e08 in zap_lockdir (os=<optimized out>, obj=<optimized out>, tx=tx@entry=0x0, lti=lti@entry=0, fatreader=fatreader@entry=B_TRUE, adding=adding@entry=B_FALSE, tag=0x7f4eacc4c110 <__func__.14426>, zapp=0x7f4e863cc760)
at ../../module/zfs/zap_micro.c:586
#9 0x00007f4eacbc26e5 in zap_lookup_norm (os=<optimized out>, zapobj=<optimized out>, name=name@entry=0x7f4eacc32615 "DSL_CRYPTO_ROOT_DDOBJ", integer_size=integer_size@entry=8, num_integers=num_integers@entry=1,
buf=buf@entry=0x7f4e863cc800, mt=(unknown: 0), realname=0x0, rn_len=0, ncp=0x0) at ../../module/zfs/zap_micro.c:956
#10 0x00007f4eacbc2780 in zap_lookup (os=<optimized out>, zapobj=<optimized out>, name=name@entry=0x7f4eacc32615 "DSL_CRYPTO_ROOT_DDOBJ", integer_size=integer_size@entry=8, num_integers=num_integers@entry=1, buf=buf@entry=0x7f4e863cc800)
at ../../module/zfs/zap_micro.c:901
#11 0x00007f4eacb03ffd in dsl_dir_get_encryption_root_ddobj (dd=dd@entry=0x7f4e7c0025b0, rddobj=rddobj@entry=0x7f4e863cc800) at ../../module/zfs/dsl_crypt.c:356
#12 0x00007f4eacb040ae in spa_keystore_wkey_hold_dd (spa=spa@entry=0x5617ae4fad70, dd=dd@entry=0x7f4e7c0025b0, tag=tag@entry=0x7f4eacc33ed0 <__func__.16399>, wkey_out=wkey_out@entry=0x7f4e863cc860) at ../../module/zfs/dsl_crypt.c:406
#13 0x00007f4eacb041c2 in spa_keystore_dsl_key_hold_dd (spa=spa@entry=0x5617ae4fad70, dd=dd@entry=0x7f4e7c0025b0, tag=tag@entry=0x7f4e7c002d60, dck_out=dck_out@entry=0x7f4e7c002e08) at ../../module/zfs/dsl_crypt.c:642
#14 0x00007f4eacb0632c in spa_keystore_create_mapping_impl (spa=0x5617ae4fad70, dsobj=174, dd=0x7f4e7c0025b0, tag=tag@entry=0x7f4e7c000fd0) at ../../module/zfs/dsl_crypt.c:917
#15 0x00007f4eacb06459 in spa_keystore_create_mapping (spa=<optimized out>, ds=ds@entry=0x7f4e7c000fd0, tag=tag@entry=0x7f4e7c000fd0) at ../../module/zfs/dsl_crypt.c:965
#16 0x00007f4eacaf0a92 in dsl_dataset_hold_obj_flags (dp=dp@entry=0x5617ae500b30, dsobj=<optimized out>, flags=flags@entry=DS_HOLD_FLAG_DECRYPT, tag=tag@entry=0x5617ad6bef00 <__func__.21632>, dsp=dsp@entry=0x7f4e863cca28)
at ../../module/zfs/dsl_dataset.c:552
#17 0x00007f4eacaf14f2 in dsl_dataset_hold_flags (dp=0x5617ae500b30, name=name@entry=0x7f4e863cccb0 "ztest/temp_20", flags=flags@entry=DS_HOLD_FLAG_DECRYPT, tag=tag@entry=0x5617ad6bef00 <__func__.21632>, dsp=dsp@entry=0x7f4e863ccad0)
at ../../module/zfs/dsl_dataset.c:586
#18 0x00007f4eacaf3207 in dsl_dataset_own (dp=<optimized out>, name=name@entry=0x7f4e863cccb0 "ztest/temp_20", flags=flags@entry=DS_HOLD_FLAG_DECRYPT, tag=tag@entry=0x5617ad6bef00 <__func__.21632>, dsp=dsp@entry=0x7f4e863ccad0)
at ../../module/zfs/dsl_dataset.c:649
#19 0x00007f4eacad5e47 in dmu_objset_own (name=0x7f4e863cccb0 "ztest/temp_20", type=DMU_OST_OTHER, readonly=B_FALSE, decrypt=B_TRUE, tag=0x5617ad6bef00 <__func__.21632>, osp=0x7f4e863ccca0) at ../../module/zfs/dmu_objset.c:698
#20 0x00005617ad6aeb4e in ztest_dmu_objset_own (name=0x7f4e863cccb0 "ztest/temp_20", type=DMU_OST_OTHER, readonly=B_FALSE, tag=0x5617ad6bef00 <__func__.21632>, osp=0x7f4e863ccca0, decrypt=B_TRUE) at ztest.c:1193
#21 0x00005617ad6b8481 in ztest_dmu_objset_create_destroy (zd=<optimized out>, id=20) at ztest.c:3740
#22 0x00005617ad6ac77c in ztest_execute (test=<optimized out>, zi=0x5617ad8c4280 <ztest_info+320>, id=20) at ztest.c:6237
#23 0x00005617ad6b2e03 in ztest_thread (arg=0x14) at ztest.c:6284
#24 0x00007f4eac5386da in start_thread (arg=0x7f4e863d0700) at pthread_create.c:456
#25 0x00007f4eac272d7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
This seems to indicate that the arc header lock was leaked. The other problems seem to be caused by the spa_namespace_lock (as @prakashsurya already mentioned).
Is there a chance this is caused by #7017?
All three ztest failures here have been addressed in master by the following patches:
f90a30a Fix txg_sync_thread hang in scan_exec_io()
0d23f5e Fix hash_lock / keystore.sk_dk_lock lock inversion
0735ecb OpenZFS 8997 - ztest assertion failure in zil_lwb_write_issue
2fe61a7 OpenZFS 8909 - 8585 can cause a use-after-free kernel panic
In addition, the deadman code was enhanced to more reliably detect hangs (8fb1ede146426e2217b35a254b4660e264854fa1), and zloop.sh was better integrated with the buildbot (cd0a89ded96d9cedbf20c70b533b616770ca4831) to log useful information from the core dump for analysis.
System information
Describe the problem you're observing
When running zloop indefinitely, I'm seeing cases where ztest appears to hang. The first sign of trouble is that zloop will stop outputting new lines; e.g.:
In this example, zloop ran for nearly 12 hours, kicking off hundreds of ztest invocations in the process (estimating, I didn't count), until this last ztest process hung (it's been in this state for several days).
There's a single ztest process running on the system right now:
I then used gcore to get a core dump of the process:
And then gdb to inspect the corefile, to see what's going on:
The "main" ztest thread appears to be waiting for all other threads to finish:
Using thread apply all where in gdb to inspect the stacks of all threads, most stacks look benign, but I see one in zio_wait:
I'm attaching a file containing the stacks of all threads from the corefile, gdb.txt, as well as the corefile itself (compressed with zip), core.17724.zip.