openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.55k stars 1.74k forks source link

Panic during `zfs send pool1/dataset1 | zfs recv -v -u -s pool2/dataset2`. #13937

Open 512yottabytes opened 2 years ago

512yottabytes commented 2 years ago

Hello, I've permanent panic during zfs send pool1/dataset1 | zfs recv -v -u -s pool2/dataset2. It appears just after 179 GB of 183 GB is transfered. It happens on Ubuntu, Fedora, FreeBSD 13.1, OpenIndiana Hipster (OI-hipster-gui-20211031) and OmniOS (omnios-r151042).

System information

Distribution Name = Ubuntu, Fedora, FreeBSD 13.1, OpenIndiana Hipster (OI-hipster-gui-20211031) and OmniOS (omnios-r151042) Distribution Version = Ubuntu 22.04, Fedora 36, FreeBSD 13.1, OpenIndiana Hipster (OI-hipster-gui-20211031) and OmniOS (omnios-r151042) Kernel Version = Linux hpws 5.15.0-48-generic #54-Ubuntu SMP Fri Aug 26 13:26:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux (as for Ubuntu 22.04) Architecture x86_64 OpenZFS Version= zfs-2.1.4-0ubuntu0.1, zfs-kmod-2.1.4-0ubuntu0.1 (as for Ubuntu 22.04)

Describe the problem you're observing:

I've permanent panic during zfs send pool1/dataset1 | zfs recv -v -u -s pool2/dataset2. It appears just after 179 GB of 183 GB is transfered. It also occurs not depending on if -v and/or -s are present or not ( zfs send pool1/dataset1 | zfs recv -u pool2/dataset2, and so on...)

Source dataset is encrypted with ZFS native encryption (aes 256 gcm). Compression = lz4 Dedup = off Checksum = sha512

Target dataset: Compression = lz4 Dedup = off Checksum = on

Target dataset can be not encrypted or be on top of LUKS. Also tried to send via SSH, the panic occurs on sender's side. Despite the error occurs on the sender size, zfs send pool1/dataset1 > /home/user/file.zfs works without any error.

UPD 1:

Just tried to cat /home/user/file.zfs | zfs recv -v -u -s pool2/dataset2, also got panic.

UPD 2:

Also have errors in rsync and same panic trying to:

zfs mount pool1/dataset1 # (to /mnt/1)
zfs mount pool2/dataset2 # (to /mnt/2)
rsync -avpxHES /mnt/1/. /mnt/2/.

zpool scrub pool1 and zpool scrub pool2 finished successfully without any error.

UPD 3:

Sending dataset's snapshot as unencrypted raw via zfs send -w pool1/dataset1@--2022-13-32--25-62-62--snapshot1 | zfs recv pool2/dataset2 works well, without errors. Sending snapshots 2 & 3 zfs send -w pool1/dataset1@--2022-14-33--26-63-63--snapshot2 | zfs recv pool2/dataset3 gives errors in zfs and spl modules, but send/recv ends successfully.

Describe how to reproduce the problem:

zpool import pool1
zpool import pool2
zfs load-key pool1/dataset1
zfs send pool1/dataset1 | zfs recv -v -u -s pool2/dataset2

Include any warning/errors/backtraces from the system logs

[ 3976.842078] Process accounting resumed
[ 5503.994386] perf: interrupt took too long (7170 > 7148), lowering kernel.perf_event_max_sample_rate to 27750
[26340.920317] VERIFY3(sa.sa_magic == SA_MAGIC) failed (3301308930 == 3100762)
[26340.920324] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.920327] Showing stack for process 122168
[26340.920329] CPU: 5 PID: 122168 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.920333] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.920335] Call Trace:
[26340.920338]  <TASK>
[26340.920342]  show_stack+0x52/0x5c
[26340.920348]  dump_stack_lvl+0x4a/0x63
[26340.920355]  dump_stack+0x10/0x16
[26340.920359]  spl_dumpstack+0x29/0x2f [spl]
[26340.920387] VERIFY3(sa.sa_magic == SA_MAGIC) failed (3636049806 == 3100762)
[26340.920390]  spl_panic+0xd1/0xe9 [spl]
[26340.920396] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.920399] Showing stack for process 122167
[26340.920407]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.920421] VERIFY3(sa.sa_magic == SA_MAGIC) failed (1575889286 == 3100762)
[26340.920425] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.920428] Showing stack for process 122166
[26340.920421]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.920803] VERIFY3(sa.sa_magic == SA_MAGIC) failed (2531917562 == 3100762)
[26340.920723]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.920810] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.920812] Showing stack for process 122163
[26340.920830] VERIFY3(sa.sa_magic == SA_MAGIC) failed (169363951 == 3100762)
[26340.920833] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.920836] Showing stack for process 122162
[26340.920893]  ? zio_write+0x51/0x100 [zfs]
[26340.921121] VERIFY3(sa.sa_magic == SA_MAGIC) failed (1635322458 == 3100762)
[26340.921125] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.921128] Showing stack for process 122160
[26340.921066]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.921215] VERIFY3(sa.sa_magic == SA_MAGIC) failed (2105456492 == 3100762)
[26340.921217] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.921219] Showing stack for process 122157
[26340.921241]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.921407]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.921532]  dnode_sync+0x557/0x620 [zfs]
[26340.921667]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.921817]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.921936]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.922052]  taskq_thread+0x221/0x400 [spl]
[26340.922078]  ? wake_up_q+0x90/0x90
[26340.922084]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.922096]  kthread+0x12a/0x150
[26340.922101]  ? set_kthread_struct+0x50/0x50
[26340.922104]  ret_from_fork+0x22/0x30
[26340.922110]  </TASK>
[26340.922113] CPU: 13 PID: 122157 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.922118] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.922120] Call Trace:
[26340.922122]  <TASK>
[26340.922124]  show_stack+0x52/0x5c
[26340.922130]  dump_stack_lvl+0x4a/0x63
[26340.922137]  dump_stack+0x10/0x16
[26340.922140]  spl_dumpstack+0x29/0x2f [spl]
[26340.922153]  spl_panic+0xd1/0xe9 [spl]
[26340.922165]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.922179]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.922380]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.922565]  ? zio_write+0x51/0x100 [zfs]
[26340.922752]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.922889] VERIFY3(sa.sa_magic == SA_MAGIC) failed (4237141769 == 3100762)
[26340.922896] PANIC at zfs_quota.c:89:zpl_get_file_info()
[26340.922900] Showing stack for process 122158
[26340.922920]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.923100]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.923227]  dnode_sync+0x557/0x620 [zfs]
[26340.923370]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.923512]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.923645]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.923780]  taskq_thread+0x221/0x400 [spl]
[26340.923799]  ? wake_up_q+0x90/0x90
[26340.923806]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.923820]  kthread+0x12a/0x150
[26340.923823]  ? set_kthread_struct+0x50/0x50
[26340.923827]  ret_from_fork+0x22/0x30
[26340.923832]  </TASK>
[26340.923834] CPU: 2 PID: 122158 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.923839] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.923842] Call Trace:
[26340.923844]  <TASK>
[26340.923847]  show_stack+0x52/0x5c
[26340.923853]  dump_stack_lvl+0x4a/0x63
[26340.923861]  dump_stack+0x10/0x16
[26340.923865]  spl_dumpstack+0x29/0x2f [spl]
[26340.923884]  spl_panic+0xd1/0xe9 [spl]
[26340.923898]  ? set_next_entity+0xbb/0x210
[26340.923905]  ? sched_clock+0x9/0x10
[26340.923910]  ? sched_clock_cpu+0x12/0xf0
[26340.923913]  ? psi_task_switch+0xc6/0x220
[26340.923918]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.924117]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.924251]  dnode_sync+0x557/0x620 [zfs]
[26340.924388]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.924542]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.924673]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.924803]  taskq_thread+0x221/0x400 [spl]
[26340.924817]  ? wake_up_q+0x90/0x90
[26340.924823]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.924837]  kthread+0x12a/0x150
[26340.924841]  ? set_kthread_struct+0x50/0x50
[26340.924844]  ret_from_fork+0x22/0x30
[26340.924850]  </TASK>
[26340.924852] CPU: 11 PID: 122167 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.924856] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.924858] Call Trace:
[26340.924859]  <TASK>
[26340.924861]  show_stack+0x52/0x5c
[26340.924865]  dump_stack_lvl+0x4a/0x63
[26340.924869]  dump_stack+0x10/0x16
[26340.924872]  spl_dumpstack+0x29/0x2f [spl]
[26340.924884]  spl_panic+0xd1/0xe9 [spl]
[26340.924896]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.924908]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.925054]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.925184]  ? zio_write+0x51/0x100 [zfs]
[26340.925330]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.925477]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.925622]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.925723]  dnode_sync+0x557/0x620 [zfs]
[26340.925832]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.925958]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.926060]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.926161]  taskq_thread+0x221/0x400 [spl]
[26340.926173]  ? wake_up_q+0x90/0x90
[26340.926178]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.926189]  kthread+0x12a/0x150
[26340.926192]  ? set_kthread_struct+0x50/0x50
[26340.926195]  ret_from_fork+0x22/0x30
[26340.926199]  </TASK>
[26340.926201] CPU: 12 PID: 122166 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.926205] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.926206] Call Trace:
[26340.926208]  <TASK>
[26340.926209]  show_stack+0x52/0x5c
[26340.926213]  dump_stack_lvl+0x4a/0x63
[26340.926217]  dump_stack+0x10/0x16
[26340.926220]  spl_dumpstack+0x29/0x2f [spl]
[26340.926229]  spl_panic+0xd1/0xe9 [spl]
[26340.926239]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.926248]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.926367]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.926472]  ? zio_write+0x51/0x100 [zfs]
[26340.926588]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.926705]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.926822]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.926904]  dnode_sync+0x557/0x620 [zfs]
[26340.926991]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.927092]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.927173]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.927254]  taskq_thread+0x221/0x400 [spl]
[26340.927264]  ? wake_up_q+0x90/0x90
[26340.927267]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.927276]  kthread+0x12a/0x150
[26340.927278]  ? set_kthread_struct+0x50/0x50
[26340.927281]  ret_from_fork+0x22/0x30
[26340.927284]  </TASK>
[26340.927286] CPU: 9 PID: 122162 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.927293] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.927295] Call Trace:
[26340.927298]  <TASK>
[26340.927300]  show_stack+0x52/0x5c
[26340.927305]  dump_stack_lvl+0x4a/0x63
[26340.927311]  dump_stack+0x10/0x16
[26340.927315]  spl_dumpstack+0x29/0x2f [spl]
[26340.927334]  spl_panic+0xd1/0xe9 [spl]
[26340.927354]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.927372]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.927603]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.927826]  ? zio_write+0x51/0x100 [zfs]
[26340.928057]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.928288]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.928518]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.928679]  dnode_sync+0x557/0x620 [zfs]
[26340.928850]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.929049]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.929209]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.929369]  taskq_thread+0x221/0x400 [spl]
[26340.929387]  ? wake_up_q+0x90/0x90
[26340.929394]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.929411]  kthread+0x12a/0x150
[26340.929416]  ? set_kthread_struct+0x50/0x50
[26340.929420]  ret_from_fork+0x22/0x30
[26340.929426]  </TASK>
[26340.929429] CPU: 6 PID: 122160 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.929432] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.929433] Call Trace:
[26340.929435]  <TASK>
[26340.929437]  show_stack+0x52/0x5c
[26340.929441]  dump_stack_lvl+0x4a/0x63
[26340.929445]  dump_stack+0x10/0x16
[26340.929448]  spl_dumpstack+0x29/0x2f [spl]
[26340.929461]  spl_panic+0xd1/0xe9 [spl]
[26340.929474]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.929486]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.929625]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.929777]  ? zio_write+0x51/0x100 [zfs]
[26340.929945]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.930115]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.930295]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.930430]  dnode_sync+0x557/0x620 [zfs]
[26340.930550]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.930652]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.930735]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.930826]  taskq_thread+0x221/0x400 [spl]
[26340.930837]  ? wake_up_q+0x90/0x90
[26340.930841]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.930852]  kthread+0x12a/0x150
[26340.930855]  ? set_kthread_struct+0x50/0x50
[26340.930857]  ret_from_fork+0x22/0x30
[26340.930860]  </TASK>
[26340.930863] CPU: 3 PID: 122163 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26340.930869] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[26340.930871] Call Trace:
[26340.930873]  <TASK>
[26340.930875]  show_stack+0x52/0x5c
[26340.930880]  dump_stack_lvl+0x4a/0x63
[26340.930885]  dump_stack+0x10/0x16
[26340.930889]  spl_dumpstack+0x29/0x2f [spl]
[26340.930908]  spl_panic+0xd1/0xe9 [spl]
[26340.930928]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26340.930945]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26340.931138]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26340.931291]  ? zio_write+0x51/0x100 [zfs]
[26340.931454]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26340.931619]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26340.931787]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26340.931902]  dnode_sync+0x557/0x620 [zfs]
[26340.932023]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26340.932165]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26340.932323]  sync_dnodes_task+0x2b/0x50 [zfs]
[26340.932438]  taskq_thread+0x221/0x400 [spl]
[26340.932448]  ? wake_up_q+0x90/0x90
[26340.932455]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26340.932466]  kthread+0x12a/0x150
[26340.932469]  ? set_kthread_struct+0x50/0x50
[26340.932471]  ret_from_fork+0x22/0x30
[26340.932474]  </TASK>
[26463.126525] INFO: task dp_sync_taskq:122157 blocked for more than 120 seconds.
[26463.126536]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.126539] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.126542] task:dp_sync_taskq   state:D stack:    0 pid:122157 ppid:     2 flags:0x00004000
[26463.126548] Call Trace:
[26463.126551]  <TASK>
[26463.126556]  __schedule+0x23d/0x5a0
[26463.126567]  schedule+0x4e/0xc0
[26463.126572]  spl_panic+0xe7/0xe9 [spl]
[26463.126596]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.126614]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.126866]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.127072]  ? zio_write+0x51/0x100 [zfs]
[26463.127302]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.127534]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.127764]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.127925]  dnode_sync+0x557/0x620 [zfs]
[26463.128096]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.128295]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.128455]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.128615]  taskq_thread+0x221/0x400 [spl]
[26463.128633]  ? wake_up_q+0x90/0x90
[26463.128642]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.128660]  kthread+0x12a/0x150
[26463.128665]  ? set_kthread_struct+0x50/0x50
[26463.128670]  ret_from_fork+0x22/0x30
[26463.128677]  </TASK>
[26463.128679] INFO: task dp_sync_taskq:122158 blocked for more than 120 seconds.
[26463.128683]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.128686] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.128688] task:dp_sync_taskq   state:D stack:    0 pid:122158 ppid:     2 flags:0x00004000
[26463.128693] Call Trace:
[26463.128695]  <TASK>
[26463.128697]  __schedule+0x23d/0x5a0
[26463.128702]  schedule+0x4e/0xc0
[26463.128706]  spl_panic+0xe7/0xe9 [spl]
[26463.128726]  ? set_next_entity+0xbb/0x210
[26463.128732]  ? sched_clock+0x9/0x10
[26463.128737]  ? sched_clock_cpu+0x12/0xf0
[26463.128741]  ? psi_task_switch+0xc6/0x220
[26463.128747]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.128977]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.129138]  dnode_sync+0x557/0x620 [zfs]
[26463.129309]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.129508]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.129669]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.129829]  taskq_thread+0x221/0x400 [spl]
[26463.129847]  ? wake_up_q+0x90/0x90
[26463.129854]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.129871]  kthread+0x12a/0x150
[26463.129876]  ? set_kthread_struct+0x50/0x50
[26463.129881]  ret_from_fork+0x22/0x30
[26463.129887]  </TASK>
[26463.129889] INFO: task dp_sync_taskq:122160 blocked for more than 120 seconds.
[26463.129893]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.129896] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.129899] task:dp_sync_taskq   state:D stack:    0 pid:122160 ppid:     2 flags:0x00004000
[26463.129903] Call Trace:
[26463.129905]  <TASK>
[26463.129907]  __schedule+0x23d/0x5a0
[26463.129912]  schedule+0x4e/0xc0
[26463.129916]  spl_panic+0xe7/0xe9 [spl]
[26463.129936]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.129954]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.130206]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.130425]  ? zio_write+0x51/0x100 [zfs]
[26463.130738]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.131018]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.131250]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.131411]  dnode_sync+0x557/0x620 [zfs]
[26463.131583]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.131782]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.131943]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.132104]  taskq_thread+0x221/0x400 [spl]
[26463.132122]  ? wake_up_q+0x90/0x90
[26463.132129]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.132147]  kthread+0x12a/0x150
[26463.132152]  ? set_kthread_struct+0x50/0x50
[26463.132157]  ret_from_fork+0x22/0x30
[26463.132163]  </TASK>
[26463.132166] INFO: task dp_sync_taskq:122162 blocked for more than 120 seconds.
[26463.132170]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.132173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.132176] task:dp_sync_taskq   state:D stack:    0 pid:122162 ppid:     2 flags:0x00004000
[26463.132181] Call Trace:
[26463.132183]  <TASK>
[26463.132185]  __schedule+0x23d/0x5a0
[26463.132191]  schedule+0x4e/0xc0
[26463.132195]  spl_panic+0xe7/0xe9 [spl]
[26463.132214]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.132232]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.132463]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.132670]  ? zio_write+0x51/0x100 [zfs]
[26463.132901]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.133132]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.133362]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.133524]  dnode_sync+0x557/0x620 [zfs]
[26463.133695]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.133894]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.134055]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.134236]  taskq_thread+0x221/0x400 [spl]
[26463.134255]  ? wake_up_q+0x90/0x90
[26463.134262]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.134279]  kthread+0x12a/0x150
[26463.134284]  ? set_kthread_struct+0x50/0x50
[26463.134289]  ret_from_fork+0x22/0x30
[26463.134295]  </TASK>
[26463.134297] INFO: task dp_sync_taskq:122163 blocked for more than 120 seconds.
[26463.134303]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.134308] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.134311] task:dp_sync_taskq   state:D stack:    0 pid:122163 ppid:     2 flags:0x00004000
[26463.134318] Call Trace:
[26463.134320]  <TASK>
[26463.134323]  __schedule+0x23d/0x5a0
[26463.134331]  schedule+0x4e/0xc0
[26463.134336]  spl_panic+0xe7/0xe9 [spl]
[26463.134364]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.134390]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.134689]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.134897]  ? zio_write+0x51/0x100 [zfs]
[26463.135127]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.135359]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.135590]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.135751]  dnode_sync+0x557/0x620 [zfs]
[26463.135922]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.136122]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.136283]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.136443]  taskq_thread+0x221/0x400 [spl]
[26463.136462]  ? wake_up_q+0x90/0x90
[26463.136469]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.136486]  kthread+0x12a/0x150
[26463.136491]  ? set_kthread_struct+0x50/0x50
[26463.136496]  ret_from_fork+0x22/0x30
[26463.136502]  </TASK>
[26463.136506] INFO: task dp_sync_taskq:122166 blocked for more than 120 seconds.
[26463.136511]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.136514] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.136516] task:dp_sync_taskq   state:D stack:    0 pid:122166 ppid:     2 flags:0x00004000
[26463.136521] Call Trace:
[26463.136523]  <TASK>
[26463.136525]  __schedule+0x23d/0x5a0
[26463.136530]  schedule+0x4e/0xc0
[26463.136534]  spl_panic+0xe7/0xe9 [spl]
[26463.136553]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.136572]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.136803]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.137010]  ? zio_write+0x51/0x100 [zfs]
[26463.137241]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.137472]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.137703]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.137864]  dnode_sync+0x557/0x620 [zfs]
[26463.138035]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.138249]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.138438]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.138673]  taskq_thread+0x221/0x400 [spl]
[26463.138692]  ? wake_up_q+0x90/0x90
[26463.138699]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.138717]  kthread+0x12a/0x150
[26463.138721]  ? set_kthread_struct+0x50/0x50
[26463.138726]  ret_from_fork+0x22/0x30
[26463.138733]  </TASK>
[26463.138735] INFO: task dp_sync_taskq:122167 blocked for more than 120 seconds.
[26463.138739]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.138742] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.138745] task:dp_sync_taskq   state:D stack:    0 pid:122167 ppid:     2 flags:0x00004000
[26463.138749] Call Trace:
[26463.138751]  <TASK>
[26463.138753]  __schedule+0x23d/0x5a0
[26463.138758]  schedule+0x4e/0xc0
[26463.138762]  spl_panic+0xe7/0xe9 [spl]
[26463.138782]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.138800]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.139032]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.139238]  ? zio_write+0x51/0x100 [zfs]
[26463.139469]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.139700]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.139931]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.140092]  dnode_sync+0x557/0x620 [zfs]
[26463.140263]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.140463]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.140623]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.140784]  taskq_thread+0x221/0x400 [spl]
[26463.140802]  ? wake_up_q+0x90/0x90
[26463.140809]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.140826]  kthread+0x12a/0x150
[26463.140831]  ? set_kthread_struct+0x50/0x50
[26463.140836]  ret_from_fork+0x22/0x30
[26463.140842]  </TASK>
[26463.140844] INFO: task dp_sync_taskq:122168 blocked for more than 120 seconds.
[26463.140848]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.140851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.140853] task:dp_sync_taskq   state:D stack:    0 pid:122168 ppid:     2 flags:0x00004000
[26463.140858] Call Trace:
[26463.140859]  <TASK>
[26463.140862]  __schedule+0x23d/0x5a0
[26463.140867]  schedule+0x4e/0xc0
[26463.140871]  spl_panic+0xe7/0xe9 [spl]
[26463.140890]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26463.140908]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26463.141140]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26463.141346]  ? zio_write+0x51/0x100 [zfs]
[26463.141577]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26463.141808]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26463.142039]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26463.142212]  dnode_sync+0x557/0x620 [zfs]
[26463.142408]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26463.142691]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26463.142854]  sync_dnodes_task+0x2b/0x50 [zfs]
[26463.143014]  taskq_thread+0x221/0x400 [spl]
[26463.143033]  ? wake_up_q+0x90/0x90
[26463.143040]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26463.143058]  kthread+0x12a/0x150
[26463.143063]  ? set_kthread_struct+0x50/0x50
[26463.143067]  ret_from_fork+0x22/0x30
[26463.143074]  </TASK>
[26463.143079] INFO: task txg_sync:122766 blocked for more than 120 seconds.
[26463.143084]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26463.143087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26463.143090] task:txg_sync        state:D stack:    0 pid:122766 ppid:     2 flags:0x00004000
[26463.143095] Call Trace:
[26463.143097]  <TASK>
[26463.143099]  __schedule+0x23d/0x5a0
[26463.143104]  schedule+0x4e/0xc0
[26463.143108]  taskq_wait+0x68/0xa0 [spl]
[26463.143125]  ? wait_woken+0x70/0x70
[26463.143131]  dmu_objset_sync+0x2d1/0x470 [zfs]
[26463.143295]  dsl_dataset_sync+0x67/0x2f0 [zfs]
[26463.143471]  dsl_pool_sync+0x9b/0x400 [zfs]
[26463.143658]  ? __mod_timer+0x214/0x400
[26463.143664]  spa_sync_iterate_to_convergence+0xe0/0x1f0 [zfs]
[26463.143871]  spa_sync+0x2dc/0x5b0 [zfs]
[26463.144078]  txg_sync_thread+0x266/0x2f0 [zfs]
[26463.144296]  ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[26463.144513]  thread_generic_wrapper+0x64/0x80 [spl]
[26463.144531]  ? __thread_exit+0x20/0x20 [spl]
[26463.144549]  kthread+0x12a/0x150
[26463.144554]  ? set_kthread_struct+0x50/0x50
[26463.144559]  ret_from_fork+0x22/0x30
[26463.144565]  </TASK>
[26583.956972] INFO: task dp_sync_taskq:122157 blocked for more than 241 seconds.
[26583.956982]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[26583.956986] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[26583.956988] task:dp_sync_taskq   state:D stack:    0 pid:122157 ppid:     2 flags:0x00004000
[26583.956995] Call Trace:
[26583.956998]  <TASK>
[26583.957002]  __schedule+0x23d/0x5a0
[26583.957013]  schedule+0x4e/0xc0
[26583.957018]  spl_panic+0xe7/0xe9 [spl]
[26583.957042]  ? taskq_dispatch_ent+0xec/0x150 [spl]
[26583.957061]  ? zio_execute_stack_check.constprop.0+0x20/0x20 [zfs]
[26583.957313]  ? spa_taskq_dispatch_ent+0x69/0xe0 [zfs]
[26583.957519]  ? zio_write+0x51/0x100 [zfs]
[26583.957750]  ? zio_taskq_dispatch+0x74/0x140 [zfs]
[26583.957981]  zpl_get_file_info+0x1fe/0x240 [zfs]
[26583.958211]  dmu_objset_userquota_get_ids+0x385/0x490 [zfs]
[26583.958371]  dnode_sync+0x557/0x620 [zfs]
[26583.958542]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[26583.958742]  dmu_objset_sync_dnodes+0x68/0x90 [zfs]
[26583.958902]  sync_dnodes_task+0x2b/0x50 [zfs]
[26583.959062]  taskq_thread+0x221/0x400 [spl]
[26583.959081]  ? wake_up_q+0x90/0x90
[26583.959089]  ? param_set_taskq_kick+0xf0/0xf0 [spl]
[26583.959106]  kthread+0x12a/0x150
[26583.959111]  ? set_kthread_struct+0x50/0x50
[26583.959116]  ret_from_fork+0x22/0x30
[26583.959123]  </TASK>

Panic during rsync:

zpool import -f -N pool1
zpool import -f -N pool2
zfs load-key pool1/dataset1
mount -t zfs pool1/dataset1 /mnt/1
mount -t zfs pool2/dataset4-rsync /mnt/2
rsync -avpxHES --progress /mnt/1/. /mnt/2/.

gives errors:

[  359.207437] Process accounting resumed
[ 1569.423938] perf: interrupt took too long (2518 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[ 1755.729478] perf: interrupt took too long (3428 > 3147), lowering kernel.perf_event_max_sample_rate to 58250
[ 1952.858845] perf: interrupt took too long (4307 > 4285), lowering kernel.perf_event_max_sample_rate to 46250
[ 2157.058386] perf: interrupt took too long (5403 > 5383), lowering kernel.perf_event_max_sample_rate to 37000
[ 2402.056405] perf: interrupt took too long (6845 > 6753), lowering kernel.perf_event_max_sample_rate to 29000
[ 3015.416455] perf: interrupt took too long (8892 > 8556), lowering kernel.perf_event_max_sample_rate to 22250
[ 3356.060044] perf: interrupt took too long (11189 > 11115), lowering kernel.perf_event_max_sample_rate to 17750
[ 3960.051537] Process accounting resumed
[10079.546729] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
[10079.546740] PANIC at zfs_znode.c:339:zfs_znode_sa_init()
[10079.546743] Showing stack for process 847563
[10079.546747] CPU: 9 PID: 847563 Comm: rsync Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10079.546751] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[10079.546754] Call Trace:
[10079.546758]  <TASK>
[10079.546762]  show_stack+0x52/0x5c
[10079.546771]  dump_stack_lvl+0x4a/0x63
[10079.546781]  dump_stack+0x10/0x16
[10079.546785]  spl_dumpstack+0x29/0x2f [spl]
[10079.546806]  spl_panic+0xd1/0xe9 [spl]
[10079.546826]  ? __cond_resched+0x1a/0x50
[10079.546832]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.546999]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.547148]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10079.547300]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10079.547452]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10079.547604]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10079.547834]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10079.548063]  ? arc_space_consume+0x54/0x100 [zfs]
[10079.548203]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10079.548353]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10079.548502]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10079.548654]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.548803]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10079.548953]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10079.549103]  ? __cond_resched+0x1a/0x50
[10079.549107]  ? __cond_resched+0x1a/0x50
[10079.549111]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.549248]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.549385]  ? __cond_resched+0x1a/0x50
[10079.549389]  ? down_read+0x13/0xa0
[10079.549394]  ? wmsum_add+0xe/0x20 [zfs]
[10079.549543]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10079.549693]  ? __cond_resched+0x1a/0x50
[10079.549697]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.549865]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.550033]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10079.550202]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.550358]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10079.550513]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10079.550671]  zfs_zget+0x239/0x280 [zfs]
[10079.550900]  zfs_dirent_lock+0x424/0x570 [zfs]
[10079.551129]  zfs_dirlook+0x91/0x2d0 [zfs]
[10079.551357]  zfs_lookup+0x25e/0x410 [zfs]
[10079.551586]  zpl_lookup+0xcf/0x230 [zfs]
[10079.551814]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10079.551822]  __lookup_slow+0x88/0x150
[10079.551829]  walk_component+0x145/0x1c0
[10079.551833]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10079.552061]  path_lookupat+0x6e/0x1c0
[10079.552066]  filename_lookup+0xcf/0x1d0
[10079.552071]  ? __check_object_size+0x1d/0x30
[10079.552077]  ? strncpy_from_user+0x44/0x150
[10079.552084]  ? getname_flags.part.0+0x4c/0x1b0
[10079.552089]  user_path_at_empty+0x3f/0x60
[10079.552093]  vfs_statx+0x7a/0x130
[10079.552099]  __do_sys_newfstatat+0x36/0x70
[10079.552103]  ? exit_to_user_mode_prepare+0x96/0xb0
[10079.552111]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552118]  ? do_syscall_64+0x69/0xc0
[10079.552121]  ? exit_to_user_mode_prepare+0x37/0xb0
[10079.552126]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552130]  ? __x64_sys_newfstatat+0x1c/0x30
[10079.552134]  ? do_syscall_64+0x69/0xc0
[10079.552137]  __x64_sys_newfstatat+0x1c/0x30
[10079.552142]  do_syscall_64+0x5c/0xc0
[10079.552145]  ? do_syscall_64+0x69/0xc0
[10079.552148]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552152]  ? __x64_sys_close+0x11/0x50
[10079.552157]  ? do_syscall_64+0x69/0xc0
[10079.552160]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10079.552166] RIP: 0033:0x7fbbb12afeee
[10079.552171] Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 09
[10079.552174] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10079.552179] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10079.552182] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10079.552185] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10079.552187] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10079.552190] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
[10079.552194]  </TASK>
[10271.494666] INFO: task rsync:847563 blocked for more than 120 seconds.
[10271.494676]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10271.494679] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10271.494682] task:rsync           state:D stack:    0 pid:847563 ppid: 46824 flags:0x00004002
[10271.494689] Call Trace:
[10271.494692]  <TASK>
[10271.494696]  __schedule+0x23d/0x5a0
[10271.494708]  schedule+0x4e/0xc0
[10271.494712]  spl_panic+0xe7/0xe9 [spl]
[10271.494737]  ? __cond_resched+0x1a/0x50
[10271.494741]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.494915]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.495064]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10271.495216]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10271.495368]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10271.495521]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10271.495750]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10271.495979]  ? arc_space_consume+0x54/0x100 [zfs]
[10271.496119]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10271.496269]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10271.496418]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10271.496570]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.496719]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10271.496869]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10271.497019]  ? __cond_resched+0x1a/0x50
[10271.497024]  ? __cond_resched+0x1a/0x50
[10271.497027]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.497165]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.497302]  ? __cond_resched+0x1a/0x50
[10271.497306]  ? down_read+0x13/0xa0
[10271.497311]  ? wmsum_add+0xe/0x20 [zfs]
[10271.497460]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10271.497611]  ? __cond_resched+0x1a/0x50
[10271.497615]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.497782]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.497950]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10271.498119]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.498276]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10271.498461]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10271.498703]  zfs_zget+0x239/0x280 [zfs]
[10271.498840]  zfs_dirent_lock+0x424/0x570 [zfs]
[10271.498978]  zfs_dirlook+0x91/0x2d0 [zfs]
[10271.499206]  zfs_lookup+0x25e/0x410 [zfs]
[10271.499435]  zpl_lookup+0xcf/0x230 [zfs]
[10271.499663]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10271.499670]  __lookup_slow+0x88/0x150
[10271.499676]  walk_component+0x145/0x1c0
[10271.499680]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10271.499909]  path_lookupat+0x6e/0x1c0
[10271.499913]  filename_lookup+0xcf/0x1d0
[10271.499919]  ? __check_object_size+0x1d/0x30
[10271.499924]  ? strncpy_from_user+0x44/0x150
[10271.499930]  ? getname_flags.part.0+0x4c/0x1b0
[10271.499935]  user_path_at_empty+0x3f/0x60
[10271.499940]  vfs_statx+0x7a/0x130
[10271.499946]  __do_sys_newfstatat+0x36/0x70
[10271.499950]  ? exit_to_user_mode_prepare+0x96/0xb0
[10271.499957]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499963]  ? do_syscall_64+0x69/0xc0
[10271.499966]  ? exit_to_user_mode_prepare+0x37/0xb0
[10271.499971]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499975]  ? __x64_sys_newfstatat+0x1c/0x30
[10271.499980]  ? do_syscall_64+0x69/0xc0
[10271.499983]  __x64_sys_newfstatat+0x1c/0x30
[10271.499987]  do_syscall_64+0x5c/0xc0
[10271.499991]  ? do_syscall_64+0x69/0xc0
[10271.499994]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499998]  ? __x64_sys_close+0x11/0x50
[10271.500003]  ? do_syscall_64+0x69/0xc0
[10271.500006]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10271.500012] RIP: 0033:0x7fbbb12afeee
[10271.500016] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10271.500021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10271.500024] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10271.500027] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10271.500029] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10271.500032] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
[10271.500036]  </TASK>
[10392.326030] INFO: task rsync:847563 blocked for more than 241 seconds.
[10392.326039]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10392.326042] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10392.326045] task:rsync           state:D stack:    0 pid:847563 ppid: 46824 flags:0x00004002
[10392.326051] Call Trace:
[10392.326054]  <TASK>
[10392.326059]  __schedule+0x23d/0x5a0
[10392.326070]  schedule+0x4e/0xc0
[10392.326074]  spl_panic+0xe7/0xe9 [spl]
[10392.326099]  ? __cond_resched+0x1a/0x50
[10392.326103]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.326276]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.326425]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10392.326578]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10392.326729]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10392.326882]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10392.327111]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10392.327340]  ? arc_space_consume+0x54/0x100 [zfs]
[10392.327479]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10392.327630]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10392.327779]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10392.327930]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.328080]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10392.328230]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10392.328379]  ? __cond_resched+0x1a/0x50
[10392.328384]  ? __cond_resched+0x1a/0x50
[10392.328387]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.328525]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.328662]  ? __cond_resched+0x1a/0x50
[10392.328666]  ? down_read+0x13/0xa0
[10392.328670]  ? wmsum_add+0xe/0x20 [zfs]
[10392.328820]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10392.328970]  ? __cond_resched+0x1a/0x50
[10392.328974]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.329142]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.329310]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10392.329479]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.329635]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10392.329819]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10392.330064]  zfs_zget+0x239/0x280 [zfs]
[10392.330289]  zfs_dirent_lock+0x424/0x570 [zfs]
[10392.330519]  zfs_dirlook+0x91/0x2d0 [zfs]
[10392.330748]  zfs_lookup+0x25e/0x410 [zfs]
[10392.330977]  zpl_lookup+0xcf/0x230 [zfs]
[10392.331206]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10392.331213]  __lookup_slow+0x88/0x150
[10392.331219]  walk_component+0x145/0x1c0
[10392.331223]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10392.331451]  path_lookupat+0x6e/0x1c0
[10392.331456]  filename_lookup+0xcf/0x1d0
[10392.331461]  ? __check_object_size+0x1d/0x30
[10392.331466]  ? strncpy_from_user+0x44/0x150
[10392.331473]  ? getname_flags.part.0+0x4c/0x1b0
[10392.331478]  user_path_at_empty+0x3f/0x60
[10392.331482]  vfs_statx+0x7a/0x130
[10392.331488]  __do_sys_newfstatat+0x36/0x70
[10392.331493]  ? exit_to_user_mode_prepare+0x96/0xb0
[10392.331499]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331505]  ? do_syscall_64+0x69/0xc0
[10392.331509]  ? exit_to_user_mode_prepare+0x37/0xb0
[10392.331513]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331518]  ? __x64_sys_newfstatat+0x1c/0x30
[10392.331522]  ? do_syscall_64+0x69/0xc0
[10392.331525]  __x64_sys_newfstatat+0x1c/0x30
[10392.331530]  do_syscall_64+0x5c/0xc0
[10392.331533]  ? do_syscall_64+0x69/0xc0
[10392.331536]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331540]  ? __x64_sys_close+0x11/0x50
[10392.331545]  ? do_syscall_64+0x69/0xc0
[10392.331549]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10392.331555] RIP: 0033:0x7fbbb12afeee
[10392.331559] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10392.331563] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10392.331566] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10392.331569] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10392.331571] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10392.331574] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
512yottabytes commented 2 years ago

Just tried to cat the file I've created yesterday using zfs send pool1/dataset1 > /home/user/file.zfs: cat /home/user/file.zfs | zfs recv -v -u -s pool2/dataset2 and also got panic:

[ 3955.769256] Process accounting resumed
[ 7336.787010] VERIFY3(sa.sa_magic == SA_MAGIC) failed (2531917562 == 3100762)
[ 7336.787020] PANIC at zfs_quota.c:89:zpl_get_file_info()
[ 7336.787024] Showing stack for process 19920
[ 7336.787028] CPU: 1 PID: 19920 Comm: dp_sync_taskq Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[ 7336.787033] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016

... And so on...

512yottabytes commented 2 years ago

Also have errors in rsync and same panic trying to:

zfs mount pool1/dataset1 # (to /mnt/1)
zfs mount pool2/dataset2 # (to /mnt/2)
rsync -avpxHES /mnt/1/. /mnt/2/.

zpool scrub pool1 and zpool scrub pool2 finished successfully without any error.

ryao commented 2 years ago

It looks like something corrupted a metadata buffer before it was checksummed and written to disk.

@512yottabytes Would you provide the backtrace that you get when this happens with rsync?

512yottabytes commented 2 years ago

Hello, @ryao dmesg:

[  359.207437] Process accounting resumed
[ 1569.423938] perf: interrupt took too long (2518 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[ 1755.729478] perf: interrupt took too long (3428 > 3147), lowering kernel.perf_event_max_sample_rate to 58250
[ 1952.858845] perf: interrupt took too long (4307 > 4285), lowering kernel.perf_event_max_sample_rate to 46250
[ 2157.058386] perf: interrupt took too long (5403 > 5383), lowering kernel.perf_event_max_sample_rate to 37000
[ 2402.056405] perf: interrupt took too long (6845 > 6753), lowering kernel.perf_event_max_sample_rate to 29000
[ 3015.416455] perf: interrupt took too long (8892 > 8556), lowering kernel.perf_event_max_sample_rate to 22250
[ 3356.060044] perf: interrupt took too long (11189 > 11115), lowering kernel.perf_event_max_sample_rate to 17750
[ 3960.051537] Process accounting resumed
[10079.546729] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
[10079.546740] PANIC at zfs_znode.c:339:zfs_znode_sa_init()
[10079.546743] Showing stack for process 847563
[10079.546747] CPU: 9 PID: 847563 Comm: rsync Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10079.546751] Hardware name: Hewlett-Packard HP Z600 Workstation/0AE8h, BIOS 786G4 v03.60 02/24/2016
[10079.546754] Call Trace:
[10079.546758]  <TASK>
[10079.546762]  show_stack+0x52/0x5c
[10079.546771]  dump_stack_lvl+0x4a/0x63
[10079.546781]  dump_stack+0x10/0x16
[10079.546785]  spl_dumpstack+0x29/0x2f [spl]
[10079.546806]  spl_panic+0xd1/0xe9 [spl]
[10079.546826]  ? __cond_resched+0x1a/0x50
[10079.546832]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.546999]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.547148]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10079.547300]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10079.547452]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10079.547604]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10079.547834]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10079.548063]  ? arc_space_consume+0x54/0x100 [zfs]
[10079.548203]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10079.548353]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10079.548502]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10079.548654]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.548803]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10079.548953]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10079.549103]  ? __cond_resched+0x1a/0x50
[10079.549107]  ? __cond_resched+0x1a/0x50
[10079.549111]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.549248]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.549385]  ? __cond_resched+0x1a/0x50
[10079.549389]  ? down_read+0x13/0xa0
[10079.549394]  ? wmsum_add+0xe/0x20 [zfs]
[10079.549543]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10079.549693]  ? __cond_resched+0x1a/0x50
[10079.549697]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.549865]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10079.550033]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10079.550202]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10079.550358]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10079.550513]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10079.550671]  zfs_zget+0x239/0x280 [zfs]
[10079.550900]  zfs_dirent_lock+0x424/0x570 [zfs]
[10079.551129]  zfs_dirlook+0x91/0x2d0 [zfs]
[10079.551357]  zfs_lookup+0x25e/0x410 [zfs]
[10079.551586]  zpl_lookup+0xcf/0x230 [zfs]
[10079.551814]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10079.551822]  __lookup_slow+0x88/0x150
[10079.551829]  walk_component+0x145/0x1c0
[10079.551833]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10079.552061]  path_lookupat+0x6e/0x1c0
[10079.552066]  filename_lookup+0xcf/0x1d0
[10079.552071]  ? __check_object_size+0x1d/0x30
[10079.552077]  ? strncpy_from_user+0x44/0x150
[10079.552084]  ? getname_flags.part.0+0x4c/0x1b0
[10079.552089]  user_path_at_empty+0x3f/0x60
[10079.552093]  vfs_statx+0x7a/0x130
[10079.552099]  __do_sys_newfstatat+0x36/0x70
[10079.552103]  ? exit_to_user_mode_prepare+0x96/0xb0
[10079.552111]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552118]  ? do_syscall_64+0x69/0xc0
[10079.552121]  ? exit_to_user_mode_prepare+0x37/0xb0
[10079.552126]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552130]  ? __x64_sys_newfstatat+0x1c/0x30
[10079.552134]  ? do_syscall_64+0x69/0xc0
[10079.552137]  __x64_sys_newfstatat+0x1c/0x30
[10079.552142]  do_syscall_64+0x5c/0xc0
[10079.552145]  ? do_syscall_64+0x69/0xc0
[10079.552148]  ? syscall_exit_to_user_mode+0x27/0x50
[10079.552152]  ? __x64_sys_close+0x11/0x50
[10079.552157]  ? do_syscall_64+0x69/0xc0
[10079.552160]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10079.552166] RIP: 0033:0x7fbbb12afeee
[10079.552171] Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 09
[10079.552174] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10079.552179] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10079.552182] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10079.552185] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10079.552187] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10079.552190] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
[10079.552194]  </TASK>
[10271.494666] INFO: task rsync:847563 blocked for more than 120 seconds.
[10271.494676]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10271.494679] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10271.494682] task:rsync           state:D stack:    0 pid:847563 ppid: 46824 flags:0x00004002
[10271.494689] Call Trace:
[10271.494692]  <TASK>
[10271.494696]  __schedule+0x23d/0x5a0
[10271.494708]  schedule+0x4e/0xc0
[10271.494712]  spl_panic+0xe7/0xe9 [spl]
[10271.494737]  ? __cond_resched+0x1a/0x50
[10271.494741]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.494915]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.495064]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10271.495216]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10271.495368]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10271.495521]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10271.495750]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10271.495979]  ? arc_space_consume+0x54/0x100 [zfs]
[10271.496119]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10271.496269]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10271.496418]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10271.496570]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.496719]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10271.496869]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10271.497019]  ? __cond_resched+0x1a/0x50
[10271.497024]  ? __cond_resched+0x1a/0x50
[10271.497027]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.497165]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.497302]  ? __cond_resched+0x1a/0x50
[10271.497306]  ? down_read+0x13/0xa0
[10271.497311]  ? wmsum_add+0xe/0x20 [zfs]
[10271.497460]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10271.497611]  ? __cond_resched+0x1a/0x50
[10271.497615]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.497782]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10271.497950]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10271.498119]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10271.498276]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10271.498461]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10271.498703]  zfs_zget+0x239/0x280 [zfs]
[10271.498840]  zfs_dirent_lock+0x424/0x570 [zfs]
[10271.498978]  zfs_dirlook+0x91/0x2d0 [zfs]
[10271.499206]  zfs_lookup+0x25e/0x410 [zfs]
[10271.499435]  zpl_lookup+0xcf/0x230 [zfs]
[10271.499663]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10271.499670]  __lookup_slow+0x88/0x150
[10271.499676]  walk_component+0x145/0x1c0
[10271.499680]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10271.499909]  path_lookupat+0x6e/0x1c0
[10271.499913]  filename_lookup+0xcf/0x1d0
[10271.499919]  ? __check_object_size+0x1d/0x30
[10271.499924]  ? strncpy_from_user+0x44/0x150
[10271.499930]  ? getname_flags.part.0+0x4c/0x1b0
[10271.499935]  user_path_at_empty+0x3f/0x60
[10271.499940]  vfs_statx+0x7a/0x130
[10271.499946]  __do_sys_newfstatat+0x36/0x70
[10271.499950]  ? exit_to_user_mode_prepare+0x96/0xb0
[10271.499957]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499963]  ? do_syscall_64+0x69/0xc0
[10271.499966]  ? exit_to_user_mode_prepare+0x37/0xb0
[10271.499971]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499975]  ? __x64_sys_newfstatat+0x1c/0x30
[10271.499980]  ? do_syscall_64+0x69/0xc0
[10271.499983]  __x64_sys_newfstatat+0x1c/0x30
[10271.499987]  do_syscall_64+0x5c/0xc0
[10271.499991]  ? do_syscall_64+0x69/0xc0
[10271.499994]  ? syscall_exit_to_user_mode+0x27/0x50
[10271.499998]  ? __x64_sys_close+0x11/0x50
[10271.500003]  ? do_syscall_64+0x69/0xc0
[10271.500006]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10271.500012] RIP: 0033:0x7fbbb12afeee
[10271.500016] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10271.500021] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10271.500024] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10271.500027] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10271.500029] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10271.500032] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
[10271.500036]  </TASK>
[10392.326030] INFO: task rsync:847563 blocked for more than 241 seconds.
[10392.326039]       Tainted: P          IOE     5.15.0-48-generic #54-Ubuntu
[10392.326042] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10392.326045] task:rsync           state:D stack:    0 pid:847563 ppid: 46824 flags:0x00004002
[10392.326051] Call Trace:
[10392.326054]  <TASK>
[10392.326059]  __schedule+0x23d/0x5a0
[10392.326070]  schedule+0x4e/0xc0
[10392.326074]  spl_panic+0xe7/0xe9 [spl]
[10392.326099]  ? __cond_resched+0x1a/0x50
[10392.326103]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.326276]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.326425]  ? dmu_buf_replace_user+0x65/0x80 [zfs]
[10392.326578]  ? dmu_buf_set_user+0x13/0x20 [zfs]
[10392.326729]  ? dmu_buf_set_user_ie+0x15/0x20 [zfs]
[10392.326882]  zfs_znode_sa_init+0xdd/0xf0 [zfs]
[10392.327111]  zfs_znode_alloc+0x101/0x5d0 [zfs]
[10392.327340]  ? arc_space_consume+0x54/0x100 [zfs]
[10392.327479]  ? dbuf_read_bonus+0x102/0x110 [zfs]
[10392.327630]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10392.327779]  ? dmu_buf_unlock_parent+0x65/0xa0 [zfs]
[10392.327930]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.328080]  ? dbuf_read_impl.constprop.0+0x352/0x490 [zfs]
[10392.328230]  ? dbuf_rele_and_unlock+0x134/0x540 [zfs]
[10392.328379]  ? __cond_resched+0x1a/0x50
[10392.328384]  ? __cond_resched+0x1a/0x50
[10392.328387]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.328525]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.328662]  ? __cond_resched+0x1a/0x50
[10392.328666]  ? down_read+0x13/0xa0
[10392.328670]  ? wmsum_add+0xe/0x20 [zfs]
[10392.328820]  ? dbuf_read+0x1e6/0x5b0 [zfs]
[10392.328970]  ? __cond_resched+0x1a/0x50
[10392.328974]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.329142]  ? __raw_spin_unlock+0x9/0x10 [zfs]
[10392.329310]  ? dnode_rele_and_unlock+0x69/0xf0 [zfs]
[10392.329479]  ? do_raw_spin_unlock+0x9/0x10 [zfs]
[10392.329635]  ? RW_WRITE_HELD+0x9/0x30 [zfs]
[10392.329819]  ? dmu_object_info_from_dnode+0x92/0xa0 [zfs]
[10392.330064]  zfs_zget+0x239/0x280 [zfs]
[10392.330289]  zfs_dirent_lock+0x424/0x570 [zfs]
[10392.330519]  zfs_dirlook+0x91/0x2d0 [zfs]
[10392.330748]  zfs_lookup+0x25e/0x410 [zfs]
[10392.330977]  zpl_lookup+0xcf/0x230 [zfs]
[10392.331206]  ? capable_wrt_inode_uidgid+0x5f/0x70
[10392.331213]  __lookup_slow+0x88/0x150
[10392.331219]  walk_component+0x145/0x1c0
[10392.331223]  ? zfs_dirty_inode+0x2af/0x330 [zfs]
[10392.331451]  path_lookupat+0x6e/0x1c0
[10392.331456]  filename_lookup+0xcf/0x1d0
[10392.331461]  ? __check_object_size+0x1d/0x30
[10392.331466]  ? strncpy_from_user+0x44/0x150
[10392.331473]  ? getname_flags.part.0+0x4c/0x1b0
[10392.331478]  user_path_at_empty+0x3f/0x60
[10392.331482]  vfs_statx+0x7a/0x130
[10392.331488]  __do_sys_newfstatat+0x36/0x70
[10392.331493]  ? exit_to_user_mode_prepare+0x96/0xb0
[10392.331499]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331505]  ? do_syscall_64+0x69/0xc0
[10392.331509]  ? exit_to_user_mode_prepare+0x37/0xb0
[10392.331513]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331518]  ? __x64_sys_newfstatat+0x1c/0x30
[10392.331522]  ? do_syscall_64+0x69/0xc0
[10392.331525]  __x64_sys_newfstatat+0x1c/0x30
[10392.331530]  do_syscall_64+0x5c/0xc0
[10392.331533]  ? do_syscall_64+0x69/0xc0
[10392.331536]  ? syscall_exit_to_user_mode+0x27/0x50
[10392.331540]  ? __x64_sys_close+0x11/0x50
[10392.331545]  ? do_syscall_64+0x69/0xc0
[10392.331549]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[10392.331555] RIP: 0033:0x7fbbb12afeee
[10392.331559] RSP: 002b:00007ffc2be9a998 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[10392.331563] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb12afeee
[10392.331566] RDX: 00007ffc2be9aaa0 RSI: 00007ffc2be9bb30 RDI: 00000000ffffff9c
[10392.331569] RBP: 00007ffc2be9aaa0 R08: 00007ffc2be9bb2f R09: 0000000000000000
[10392.331571] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000000000
[10392.331574] R13: 00007ffc2be9bb30 R14: 000055776e8adfb0 R15: 00007ffc2be9aaa0
512yottabytes commented 2 years ago

Sending dataset's snapshot as raw via zfs send -w pool1/dataset1@--2022-13-32--25-62-62--snapshot1 | zfs recv pool2/dataset2 works well, without errors. Sending snapshots 2 & 3 zfs send -w pool1/dataset1@--2022-14-33--26-63-63--snapshot2 | zfs recv pool2/dataset3 gives errors in zfs and spl modules, but send/recv ends successfully.

ryao commented 2 years ago

We need to find out what file on your pool is corrupted and restore it from backup (actually, the data is fine, it is just metadata that is corrupt). A raw send would not check the contents, so the corruption just goes from one pool to another that way.

I am busy this week. Could you contact me on Monday at around 1pm in the OpenZFS slack channel? I could try to help you then.

https://openzfs.slack.com/

i8-pi commented 1 year ago

I'm running into a panic in a similar situation. I suspect it may be the same cause

What distribution (with version) you are using

Arch linux (kernel Linux machine-name 6.0.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 29 Oct 2022 14:08:39 +0000 x86_64 GNU/Linux)

The spl and zfs versions you are using, installation method (repository or manual compilation)

2.1.6 (installed from non-official repo)

Describe the issue you are experiencing

Kernel panic, unkillable zfs process during send/receive

Describe how to reproduce the issue

Similar to the original report, but I'm doing an incremental send

zpool import pool1
zpool import pool2
zfs load-key pool1/dataset1
zfs send  -I 'pool1/dataset1'@'2022-10-03' 'pool1/dataset1'@'2022-11-02' | zfs receive  -s -F 'pool2/dataset1'

Dataset is encrypted. The panic happens if I only load the encryption key on the source and not the destination. It does not panic if the key is loaded on both ends

Including any warning/errors/backtraces from the system logs

dmesg output

[ 1107.811733] VERIFY3(0 == zap_add(mos, dsl_dir_phys(pds)->dd_child_dir_zapobj, name, sizeof (uint64_t), 1, &ddobj, tx)) failed (0 == 17)
[ 1107.811740] PANIC at dsl_dir.c:951:dsl_dir_create_sync()
[ 1107.811742] Showing stack for process 2165
[ 1107.811743] CPU: 0 PID: 2165 Comm: txg_sync Tainted: P           OE      6.0.6-arch1-1 #1 a46cc4b882cfc11c3bbb09d6a0fab3dcad53b5c2
[ 1107.811747] Hardware name: Dell Inc. Inspiron 7520/0PXH02, BIOS A11 02/20/2014
[ 1107.811748] Call Trace:
[ 1107.811752]  <TASK>
[ 1107.811755]  dump_stack_lvl+0x48/0x60
[ 1107.811762]  spl_panic+0xf4/0x10c [spl 4584b05d58613f22c91dbd8e0bc55d01674607e6]
[ 1107.811776]  dsl_dir_create_sync+0x25b/0x350 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.811885]  dsl_dataset_create_sync+0x5d/0x3a0 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.811999]  dmu_recv_begin_sync+0x390/0x9f0 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812070]  dsl_sync_task_sync+0xac/0xf0 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812151]  dsl_pool_sync+0x3f9/0x510 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812230]  spa_sync+0x569/0xf90 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812325]  ? spa_txg_history_init_io+0x117/0x120 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812451]  txg_sync_thread+0x22b/0x3f0 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812576]  ? txg_fini+0x270/0x270 [zfs dd8976dc3cdb4b6bb09fa0a98dc75c1b67476003]
[ 1107.812700]  ? __thread_exit+0x20/0x20 [spl 4584b05d58613f22c91dbd8e0bc55d01674607e6]
[ 1107.812712]  thread_generic_wrapper+0x5e/0x70 [spl 4584b05d58613f22c91dbd8e0bc55d01674607e6]
[ 1107.812725]  kthread+0xde/0x110
[ 1107.812730]  ? kthread_complete_and_exit+0x20/0x20
[ 1107.812734]  ret_from_fork+0x22/0x30
[ 1107.812740]  </TASK>
AttilaFueloep commented 1 year ago

@i8-pi I think your issue isn't related to this one but a duplicate of #13598 and #14055.

i8-pi commented 1 year ago

I think you're correct. Sorry for the noise