UBSAN: shift-out-of-bounds spew

adamdmoss commented 1 year ago

System information

Type	Version/Name
Distribution Name	Ubuntu
Distribution Version	22.04.2
Kernel Version	Linux version 5.19.0-1022-lowlatency (buildd@lcy02-amd64-044) (x86_64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #23~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Mar 29 15:33:15 UTC 2
Architecture	x64
OpenZFS Version	135d9a9048e3716c755373182720d0eba170285f / master

Describe the problem you're observing

kernel spew when mounting(?) zfs filesystems:

UBSAN: shift-out-of-bounds in /var/lib/dkms/zfs/2.1.99/build/module/zfs/zio.c:5008:28
[  390.361064] shift exponent -5 is negative

Full spew:

[  390.361053] ================================================================================
[  390.361060] UBSAN: shift-out-of-bounds in /var/lib/dkms/zfs/2.1.99/build/module/zfs/zio.c:5008:28
[  390.361064] shift exponent -5 is negative
[  390.361066] CPU: 5 PID: 10457 Comm: z_rd_int_0 Tainted: P           OE     5.19.0-1022-lowlatency #23~22.04.1-Ubuntu
[  390.361069] Hardware name: Gigabyte Technology Co., Ltd. Z68MA-D2H-B3/Z68MA-D2H-B3, BIOS F10 02/23/2012
[  390.361071] Call Trace:
[  390.361073]  <TASK>
[  390.361076]  show_stack+0x52/0x69
[  390.361082]  dump_stack_lvl+0x49/0x6d
[  390.361087]  dump_stack+0x10/0x18
[  390.361089]  ubsan_epilogue+0x9/0x43
[  390.361093]  __ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
[  390.361098]  ? __sbitmap_get_word+0x36/0x90
[  390.361104]  zbookmark_compare.cold+0x20/0x66 [zfs]
[  390.361278]  zbookmark_subtree_completed+0x60/0x90 [zfs]
[  390.361423]  dsl_scan_check_prefetch_resume+0x6b/0xa0 [zfs]
[  390.361565]  ? abd_fletcher_4_fini+0x58/0x70 [zfs]
[  390.361680]  ? abd_fletcher_4_native+0x92/0xd0 [zfs]
[  390.361824]  ? dma_map_page_attrs+0x35/0x90
[  390.361829]  ? ktime_get+0x43/0xc0
[  390.361832]  ? __rq_qos_issue+0x26/0x50
[  390.361837]  ? blk_mq_start_request+0x3d/0x150
[  390.361842]  ? nvme_prep_rq.part.0+0xac/0x120 [nvme]
[  390.361847]  ? nvme_queue_rqs+0x1e0/0x290 [nvme]
[  390.361852]  ? aggsum_add+0x1af/0x1d0 [zfs]
[  390.361976]  ? kmem_cache_alloc+0x1b3/0x340
[  390.361979]  ? spl_kmem_cache_alloc+0x121/0x790 [spl]
[  390.361991]  ? buf_cons+0x65/0x80 [zfs]
[  390.362115]  ? arc_buf_fill+0x983/0xd90 [zfs]
[  390.362240]  dsl_scan_prefetch+0x8e/0x280 [zfs]
[  390.362382]  dsl_scan_prefetch_cb+0x158/0x310 [zfs]
[  390.362525]  arc_read_done+0x2d9/0x590 [zfs]
[  390.362650]  l2arc_read_done+0x7cf/0xae0 [zfs]
[  390.362768]  ? zio_wait_for_children+0xb2/0x140 [zfs]
[  390.362905]  zio_done+0x412/0x12a0 [zfs]
[  390.363038]  zio_execute+0x94/0x170 [zfs]
[  390.363178]  taskq_thread+0x27a/0x490 [spl]
[  390.363190]  ? wake_up_q+0xa0/0xa0
[  390.363197]  ? zio_gang_tree_free+0x70/0x70 [zfs]
[  390.363339]  ? taskq_thread_spawn+0x60/0x60 [spl]
[  390.363349]  kthread+0xeb/0x120
[  390.363353]  ? kthread_complete_and_exit+0x20/0x20
[  390.363357]  ret_from_fork+0x1f/0x30
[  390.363362]  </TASK>
[  390.363408] ================================================================================

The spew points to this code in zio.c:

int
zbookmark_compare(uint16_t dbss1, uint8_t ibs1, uint16_t dbss2, uint8_t ibs2,
    const zbookmark_phys_t *zb1, const zbookmark_phys_t *zb2)
{
    /*
     * These variables represent the "equivalent" values for the zbookmark,
     * after converting zbookmarks inside the meta dnode to their
     * normal-object equivalents.
     */
    uint64_t zb1obj, zb2obj;
    uint64_t zb1L0, zb2L0;
    uint64_t zb1level, zb2level;

    if (zb1->zb_object == zb2->zb_object &&
        zb1->zb_level == zb2->zb_level &&
        zb1->zb_blkid == zb2->zb_blkid)
        return (0);

    IMPLY(zb1->zb_level > 0, ibs1 >= SPA_MINBLOCKSHIFT);
    IMPLY(zb2->zb_level > 0, ibs2 >= SPA_MINBLOCKSHIFT);

    /*
     * BP_SPANB calculates the span in blocks.
     */
    zb1L0 = (zb1->zb_blkid) * BP_SPANB(ibs1, zb1->zb_level);
    zb2L0 = (zb2->zb_blkid) * BP_SPANB(ibs2, zb2->zb_level);

... and that final line is line 5009: *zb2L0 = (zb2->zb_blkid) BP_SPANB(ibs2, zb2->zb_level);**

BP_SPANB is

#define BP_SPANB(indblkshift, level) \
    (((uint64_t)1) << ((level) * ((indblkshift) - SPA_BLKPTRSHIFT)))

spa.h says: #define SPA_BLKPTRSHIFT 7

so level * (indblkshift - 7) == -5

I think those IMPLYs are trying to catch such a situation but I'm not on a debug kernel... :)

I don't see anything clearly related in recent git history so I guess my pool has some interesting corruption, but I'm reporting it just in case.

Describe how to reproduce the problem

Unsure.

ryao commented 1 year ago

zb1->zb_level and zb2->zb_level are both 0. This is either an empty file or an embedded data, since blocks in regular files will always have a level greater than 0.

There is likely enough information here to fix this after studying the code in more detail.

adamdmoss commented 1 year ago

FWIW I don't think I can repro this again unless my pool corrupts(?) in the same way - I rolled back one dataset (that a scrub was complaining about) to an earlier snap and everything is smooth again! :grimacing: Still, I guess no sort of on-disk corruption should end up with UB inside the kernel. Perhaps very wishful thinking. :)

devZer0 commented 10 months ago

i got this one with some other kernel trace


[220073.761054] ================================================================================
[20073.761433] UBSAN: shift-out-of-bounds in /home/tom/sources/pve/pve-kernel/proxmox-kernel-6.5.11/modules/pkg-zfs/module/zfs/zio.c:5050:28
[20073.761832] shift exponent -10 is negative
[20073.762141] CPU: 3 PID: 665554 Comm: z_rd_int_1 Tainted: P           O       6.5.11-7-pve #1
[20073.762148] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[20073.762151] Call Trace:
[20073.762165]  <TASK>
[20073.762190]  dump_stack_lvl+0x48/0x70
[20073.762237]  dump_stack+0x10/0x20
[20073.762244]  __ubsan_handle_shift_out_of_bounds+0x1ac/0x360
[20073.762255]  zbookmark_compare.cold+0x20/0x66 [zfs]
[20073.763336]  zbookmark_subtree_completed+0x60/0x90 [zfs]
[20073.763593]  dsl_scan_check_prefetch_resume+0x82/0xc0 [zfs]
[20073.763919]  dsl_scan_prefetch+0x96/0x290 [zfs]
[20073.764298]  dsl_scan_prefetch_cb+0x15f/0x350 [zfs]
[20073.764610]  arc_read_done+0x2ad/0x4b0 [zfs]
[20073.764900]  l2arc_read_done+0x94a/0xaa0 [zfs]
[20073.765180]  ? vdev_queue_io_to_issue+0x4a4/0xce0 [zfs]
[20073.765515]  zio_done+0x28c/0x10b0 [zfs]
[20073.765826]  ? _raw_spin_unlock+0xe/0x40
[20073.765835]  ? zio_wait_for_children+0x91/0xd0 [zfs]
[20073.766138]  zio_execute+0x8b/0x130 [zfs]
[20073.766433]  ? _raw_spin_unlock_irqrestore+0x11/0x60
[20073.766440]  taskq_thread+0x282/0x490 [spl]
[20073.766494]  ? __pfx_default_wake_function+0x10/0x10
[20073.766505]  ? __pfx_zio_execute+0x10/0x10 [zfs]
[20073.766800]  ? __pfx_taskq_thread+0x10/0x10 [spl]
[20073.766823]  kthread+0xf2/0x120
[20073.766841]  ? __pfx_kthread+0x10/0x10
[20073.766847]  ret_from_fork+0x47/0x70
[20073.766874]  ? __pfx_kthread+0x10/0x10
[20073.766880]  ret_from_fork_asm+0x1b/0x30
[20073.766894]  </TASK>
[20073.766923] ================================================================================
[641249.027057] INFO: task tokio-runtime-w:857942 blocked for more than 120 seconds.
[641249.027551]       Tainted: P           O       6.5.11-7-pve #1
[641249.027898] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[641249.028232] task:tokio-runtime-w state:D stack:0     pid:857942 ppid:1      flags:0x00004002
[641249.028241] Call Trace:
[641249.028261]  <TASK>
[641249.028294]  __schedule+0x3fd/0x1450
[641249.028304]  ? __schedule+0x405/0x1450
[641249.028310]  ? _raw_spin_unlock_irq+0xe/0x50
[641249.028318]  schedule+0x63/0x110
[641249.028323]  wb_wait_for_completion+0x89/0xc0
[641249.028342]  ? __pfx_autoremove_wake_function+0x10/0x10
[641249.028357]  __writeback_inodes_sb_nr+0x9d/0xd0
[641249.028383]  writeback_inodes_sb+0x3c/0x60
[641249.028404]  sync_filesystem+0x3d/0xb0
[641249.028415]  __x64_sys_syncfs+0x49/0xb0
[641249.028421]  do_syscall_64+0x5b/0x90
[641249.028428]  ? exit_to_user_mode_prepare+0xa5/0x190
[641249.028443]  ? syscall_exit_to_user_mode+0x37/0x60
[641249.028450]  ? do_syscall_64+0x67/0x90
[641249.028455]  ? exc_page_fault+0x94/0x1b0
[641249.028461]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[641249.028484] RIP: 0033:0x7fe8d5d1db57
[641249.028545] RSP: 002b:00007fe7b548d2a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000132
[641249.028551] RAX: ffffffffffffffda RBX: 00007fe7b548d2f8 RCX: 00007fe8d5d1db57
[641249.028554] RDX: 00007fef568dce4c RSI: 0000000000000007 RDI: 000000000000001c
[641249.028558] RBP: 000000000000001c R08: 0000000000000007 R09: 00007fe8a800cae0
[641249.028560] R10: 6362d3f42356e84f R11: 0000000000000202 R12: 0000000000000001
[641249.028563] R13: 00007fe85c03af90 R14: 000000000000001c R15: 00007fe8a803c120
[641249.028570]  </TASK>
[641249.028584] INFO: task tokio-runtime-w:858018 blocked for more than 120 seconds.
[641249.028927]       Tainted: P           O       6.5.11-7-pve #1
[641249.029260] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[641249.029585] task:tokio-runtime-w state:D stack:0     pid:858018 ppid:1      flags:0x00004002
[641249.029592] Call Trace:
[641249.029594]  <TASK>
[641249.029598]  __schedule+0x3fd/0x1450
[641249.029604]  ? __queue_delayed_work+0x83/0xf0
[641249.029622]  ? _raw_spin_unlock_irq+0xe/0x50
[641249.029629]  schedule+0x63/0x110
[641249.029634]  wb_wait_for_completion+0x89/0xc0
[641249.029639]  ? __pfx_autoremove_wake_function+0x10/0x10
[641249.029645]  sync_inodes_sb+0xd6/0x2c0
[641249.029652]  sync_filesystem+0x70/0xb0
[641249.029662]  __x64_sys_syncfs+0x49/0xb0
[641249.029696]  do_syscall_64+0x5b/0x90
[641249.029701]  ? do_syscall_64+0x67/0x90
[641249.029706]  ? syscall_exit_to_user_mode+0x37/0x60
[641249.029712]  ? do_syscall_64+0x67/0x90
[641249.029717]  ? syscall_exit_to_user_mode+0x37/0x60
[641249.029722]  ? do_syscall_64+0x67/0x90
[641249.029727]  ? exc_page_fault+0x94/0x1b0
[641249.029733]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[641249.029739] RIP: 0033:0x7fe8d5d1db57
[641249.029749] RSP: 002b:00007fe7972df2a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000132
[641249.029754] RAX: ffffffffffffffda RBX: 00007fe7972df2f8 RCX: 00007fe8d5d1db57
[641249.029757] RDX: 00007fef867f3d82 RSI: 0000000000000007 RDI: 000000000000003d
[641249.029759] RBP: 000000000000003d R08: 0000000000000007 R09: 00007fe7d00326f0
[641249.029762] R10: 6362d3f42356e84f R11: 0000000000000202 R12: 0000000000000001
[641249.029782] R13: 00007fe8c806c910 R14: 000000000000001c R15: 00007fe7d00fd500
[641249.029789]  </TASK>

devZer0 commented 10 months ago

openzfs / zfs