kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0
559 stars 243 forks source link

Suspicious 0 sized vmalloc followed by "zoned: failed to read zone xxxx" error #915

Open karuboniru opened 1 week ago

karuboniru commented 1 week ago

Dmesg: https://gist.github.com/karuboniru/b9cf4c2d5ba290e3c4f7bd7091ca67fa

Happens on 6.11.3-200.fc40.x86_64, when writing ~2TB of data to a newly formatted HC650 disk. No further error is detected after seeing the error. The write seems to have succeed.

Might be related to https://github.com/kdave/btrfs-progs/issues/779 but the volume did not go RO afterwards

[ 8336.219294] kworker/u17:5: vmalloc error: size 0, failed to allocate pages, mode:0x10dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[ 8336.219332] CPU: 0 UID: 0 PID: 5765 Comm: kworker/u17:5 Not tainted 6.11.3-200.fc40.x86_64 #1
[ 8336.219339] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 12/20/2018
[ 8336.219343] Workqueue: events_unbound btrfs_async_reclaim_data_space
[ 8336.219356] Call Trace:
[ 8336.219363]  <TASK>
[ 8336.219371]  dump_stack_lvl+0x5d/0x80
[ 8336.219385]  warn_alloc+0x164/0x190
[ 8336.219393]  ? alloc_pages_mpol_noprof+0xd7/0x1e0
[ 8336.219401]  __vmalloc_node_range_noprof+0x7d6/0x870
[ 8336.219408]  ? sd_zbc_report_zones+0xde/0x5a0
[ 8336.219417]  __vmalloc_node_noprof+0x4e/0x70
[ 8336.219422]  ? sd_zbc_report_zones+0xde/0x5a0
[ 8336.219426]  sd_zbc_report_zones+0xde/0x5a0
[ 8336.219432]  ? btrfs_verify_level_key+0xcb/0x130
[ 8336.219437]  ? __pfx_copy_zone_info_cb+0x10/0x10
[ 8336.219443]  ? __slab_free+0xdf/0x2e0
[ 8336.219450]  btrfs_get_dev_zones+0xc1/0x1d0
[ 8336.219455]  btrfs_load_block_group_zone_info+0x2f1/0xdd0
[ 8336.219461]  ? kmem_cache_alloc_noprof+0x12d/0x2f0
[ 8336.219468]  ? __kmalloc_cache_noprof+0x138/0x310
[ 8336.219474]  btrfs_make_block_group+0xda/0x2b0
[ 8336.219481]  btrfs_create_chunk+0x967/0xf10
[ 8336.219492]  btrfs_chunk_alloc+0x174/0x500
[ 8336.219499]  flush_space+0x122/0x570
[ 8336.219505]  ? finish_task_switch.isra.0+0x99/0x2e0
[ 8336.219513]  ? __schedule+0x434/0x1660
[ 8336.219521]  btrfs_async_reclaim_data_space+0x67/0x190
[ 8336.219528]  process_one_work+0x179/0x330
[ 8336.219536]  worker_thread+0x252/0x390
[ 8336.219543]  ? __pfx_worker_thread+0x10/0x10
[ 8336.219548]  kthread+0xd2/0x100
[ 8336.219554]  ? __pfx_kthread+0x10/0x10
[ 8336.219558]  ret_from_fork+0x34/0x50
[ 8336.219565]  ? __pfx_kthread+0x10/0x10
[ 8336.219570]  ret_from_fork_asm+0x1a/0x30
[ 8336.219578]  </TASK>
[ 8336.219613] Mem-Info:
[ 8336.219617] active_anon:9552 inactive_anon:3461 isolated_anon:0
                active_file:106638 inactive_file:663747 isolated_file:0
                unevictable:4 dirty:70791 writeback:54594
                slab_reclaimable:50321 slab_unreclaimable:61334
                mapped:7852 shmem:21 pagetables:2454
                sec_pagetables:0 bounce:0
                kernel_misc_reclaimable:0
                free:41175 free_pcp:191 free_cma:0
[ 8336.219628] Node 0 active_anon:38208kB inactive_anon:13844kB active_file:426552kB inactive_file:2654988kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:31408kB dirty:283164kB writeback:218376kB shmem:84kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:5600kB pagetables:9816kB sec_pagetables:0kB all_unreclaimable? no
[ 8336.219638] Node 0 DMA free:324kB boost:2048kB min:2316kB low:2380kB high:2444kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:14520kB unevictable:0kB writepending:7188kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 8336.219649] lowmem_reserve[]: 0 2842 3735 0 0
[ 8336.219658] Node 0 DMA32 free:120636kB boost:115236kB min:166452kB low:179256kB high:192060kB reserved_highatomic:2048KB active_anon:26296kB inactive_anon:6060kB active_file:212688kB inactive_file:2215688kB unevictable:0kB writepending:238384kB present:3011464kB managed:2945824kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 8336.219668] lowmem_reserve[]: 0 0 893 0 0
[ 8336.219677] Node 0 Normal free:43740kB boost:36216kB min:52312kB low:56336kB high:60360kB reserved_highatomic:2048KB active_anon:11912kB inactive_anon:7784kB active_file:213860kB inactive_file:424616kB unevictable:16kB writepending:256796kB present:1048576kB managed:921916kB mlocked:0kB bounce:0kB free_pcp:896kB local_pcp:0kB free_cma:0kB
[ 8336.219688] lowmem_reserve[]: 0 0 0 0 0
[ 8336.219696] Node 0 DMA: 3*4kB (U) 3*8kB (U) 2*16kB (U) 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 324kB
[ 8336.219724] Node 0 DMA32: 5277*4kB (UMH) 5544*8kB (UMEH) 2664*16kB (UMEH) 361*32kB (UMEH) 9*64kB (H) 3*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 120596kB
[ 8336.219755] Node 0 Normal: 1618*4kB (UMEH) 2816*8kB (UMEH) 594*16kB (UMH) 188*32kB (UH) 4*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 44776kB
[ 8336.219785] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 8336.219789] 770390 total pagecache pages
[ 8336.219791] 54 pages in swap cache
[ 8336.219792] Free swap  = 3711484kB
[ 8336.219794] Total swap = 3883004kB
[ 8336.219796] 1019006 pages RAM
[ 8336.219798] 0 pages HighMem/MovableOnly
[ 8336.219799] 48231 pages reserved
[ 8336.219801] 0 pages cma reserved
[ 8336.219803] 0 pages hwpoisoned
[ 8336.219812] BTRFS error (device sdd): zoned: failed to read zone 846645428224 on /dev/sdd (devid 1)
[ 8336.221330] BTRFS error (device sdd): zoned: failed to read zone 846913863680 on /dev/sdd (devid 1)
[ 8336.222228] BTRFS error (device sdd): zoned: failed to read zone 847182299136 on /dev/sdd (devid 1)
[ 8336.222969] BTRFS error (device sdd): zoned: failed to read zone 847450734592 on /dev/sdd (devid 1)
[ 8336.224564] BTRFS error (device sdd): zoned: failed to read zone 847719170048 on /dev/sdd (devid 1)
[ 8336.225848] BTRFS error (device sdd): zoned: failed to read zone 847987605504 on /dev/sdd (devid 1)

Still trying to find a way to reproduce the issue.

karuboniru commented 1 week ago

I tried to atttach the bpf program from the previous issue, and re-fromat the disk, and re-write the data. The error showed again:

[23876.672196] kworker/u18:2: vmalloc error: size 0, failed to allocate pages, mode:0x10dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[23876.672228] CPU: 2 UID: 0 PID: 156109 Comm: kworker/u18:2 Not tainted 6.11.3-200.fc40.x86_64 #1
[23876.672236] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 12/20/2018
[23876.672240] Workqueue: events_unbound btrfs_async_reclaim_data_space
[23876.672252] Call Trace:
[23876.672257]  <TASK>
[23876.672264]  dump_stack_lvl+0x5d/0x80
[23876.672274]  warn_alloc+0x164/0x190
[23876.672281]  ? alloc_pages_mpol_noprof+0xd7/0x1e0
[23876.672289]  __vmalloc_node_range_noprof+0x7d6/0x870
[23876.672296]  ? sd_zbc_report_zones+0xde/0x5a0
[23876.672304]  __vmalloc_node_noprof+0x4e/0x70
[23876.672309]  ? sd_zbc_report_zones+0xde/0x5a0
[23876.672314]  sd_zbc_report_zones+0xde/0x5a0
[23876.672319]  ? __pfx_copy_zone_info_cb+0x10/0x10
[23876.672325]  ? __pfx_sd_zbc_report_zones+0x10/0x10
[23876.672330]  ? __pfx_copy_zone_info_cb+0x10/0x10
[23876.672335]  btrfs_get_dev_zones+0xc1/0x1d0
[23876.672340]  btrfs_load_block_group_zone_info+0x2f1/0xdd0
[23876.672346]  ? kmem_cache_alloc_noprof+0x12d/0x2f0
[23876.672353]  ? __kmalloc_cache_noprof+0x138/0x310
[23876.672360]  btrfs_make_block_group+0xda/0x2b0
[23876.672366]  btrfs_create_chunk+0x967/0xf10
[23876.672375]  btrfs_chunk_alloc+0x174/0x500
[23876.672383]  flush_space+0x122/0x570
[23876.672388]  ? finish_task_switch.isra.0+0x99/0x2e0
[23876.672395]  ? __schedule+0x434/0x1660
[23876.672403]  btrfs_async_reclaim_data_space+0x67/0x190
[23876.672409]  process_one_work+0x179/0x330
[23876.672416]  worker_thread+0x252/0x390
[23876.672423]  ? __pfx_worker_thread+0x10/0x10
[23876.672428]  kthread+0xd2/0x100
[23876.672433]  ? __pfx_kthread+0x10/0x10
[23876.672437]  ret_from_fork+0x34/0x50
[23876.672447]  ? __pfx_kthread+0x10/0x10
[23876.672451]  ret_from_fork_asm+0x1a/0x30
[23876.672461]  </TASK>
[23876.672566] Mem-Info:
[23876.672571] active_anon:27362 inactive_anon:35201 isolated_anon:0
                active_file:145472 inactive_file:526171 isolated_file:0
                unevictable:4 dirty:67450 writeback:8270
                slab_reclaimable:102501 slab_unreclaimable:54021
                mapped:27306 shmem:11487 pagetables:2463
                sec_pagetables:0 bounce:0
                kernel_misc_reclaimable:0
                free:49613 free_pcp:92 free_cma:0
[23876.672582] Node 0 active_anon:109448kB inactive_anon:140804kB active_file:581888kB inactive_file:2104684kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:109224kB dirty:269800kB writeback:32932kB shmem:45948kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:5888kB pagetables:9852kB sec_pagetables:0kB all_unreclaimable? no
[23876.672592] Node 0 DMA free:268kB boost:0kB min:268kB low:332kB high:396kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:14576kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[23876.672604] lowmem_reserve[]: 0 2842 3735 0 0
[23876.672613] Node 0 DMA32 free:172492kB boost:115236kB min:166452kB low:179256kB high:192060kB reserved_highatomic:2048KB active_anon:54924kB inactive_anon:76820kB active_file:300820kB inactive_file:1916564kB unevictable:0kB writepending:298496kB present:3011464kB managed:2945824kB mlocked:0kB bounce:0kB free_pcp:608kB local_pcp:0kB free_cma:0kB
[23876.672624] lowmem_reserve[]: 0 0 893 0 0
[23876.672633] Node 0 Normal free:25692kB boost:36216kB min:52312kB low:56336kB high:60360kB reserved_highatomic:2048KB active_anon:54524kB inactive_anon:63984kB active_file:280892kB inactive_file:174184kB unevictable:16kB writepending:4516kB present:1048576kB managed:921916kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[23876.672644] lowmem_reserve[]: 0 0 0 0 0
[23876.672652] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 268kB
[23876.672680] Node 0 DMA32: 9153*4kB (UME) 5967*8kB (UME) 2944*16kB (UME) 1025*32kB (UMEH) 127*64kB (UMEH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 172380kB
[23876.672709] Node 0 Normal: 193*4kB (UME) 348*8kB (UMEH) 341*16kB (UMEH) 518*32kB (UMEH) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25716kB
[23876.672740] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[23876.672744] 683845 total pagecache pages
[23876.672746] 613 pages in swap cache
[23876.672748] Free swap  = 3759612kB
[23876.672750] Total swap = 3883004kB
[23876.672752] 1019006 pages RAM
[23876.672754] 0 pages HighMem/MovableOnly
[23876.672755] 48231 pages reserved
[23876.672757] 0 pages cma reserved
[23876.672759] 0 pages hwpoisoned
[23876.672771] BTRFS error (device sdd): zoned: failed to read zone 228170137600 on /dev/sdd (devid 1)

I am getting nothing special other from the trace: trace_pipe.txt, which is always lines like

worker/u18:1-97325   [003] ...21 24267.794484: bpf_trace_printk: nr_zones=1, max_hw_sectors=33550336, max_segments=688128
karuboniru commented 1 week ago

dmesg-patched-2.log

I tried to patch the kernel on the system following the comment here, first test with 6.12.0-rc4 did not reproduce the issue (unsure the reason, fixed or just coincidence?) The log attached is with the second try with 6.11.5. The place where I see the error is:

[ 9531.139017] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9533.057222] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9535.108599] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9537.143737] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9539.453483] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9541.661255] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9541.661764] kworker/u18:2: vmalloc error: size 0, failed to allocate pages, mode:0x10dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[ 9541.661789] CPU: 3 UID: 0 PID: 1915 Comm: kworker/u18:2 Not tainted 6.11.5-200.fc40.x86_64 #1
[ 9541.661797] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 12/20/2018
[ 9541.661802] Workqueue: events_unbound btrfs_async_reclaim_data_space
[ 9541.661815] Call Trace:
[ 9541.661821]  <TASK>
[ 9541.661828]  dump_stack_lvl+0x5d/0x80
[ 9541.661840]  warn_alloc+0x164/0x190
[ 9541.661849]  ? alloc_pages_mpol_noprof+0xd7/0x1e0
[ 9541.661858]  __vmalloc_node_range_noprof+0x7d6/0x870
[ 9541.661866]  ? sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661875]  ? _printk+0x6c/0x87
[ 9541.661882]  __vmalloc_node_noprof+0x4e/0x70
[ 9541.661888]  ? sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661894]  sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661901]  ? __pfx_copy_zone_info_cb+0x10/0x10
[ 9541.661908]  ? __slab_free+0xdf/0x2e0
[ 9541.661915]  btrfs_get_dev_zones+0xc1/0x1d0
[ 9541.661922]  btrfs_load_block_group_zone_info+0x2f1/0xe00
[ 9541.661929]  ? kmem_cache_alloc_noprof+0x12d/0x2f0
[ 9541.661937]  ? __kmalloc_cache_noprof+0x138/0x310
[ 9541.661945]  btrfs_make_block_group+0xda/0x2b0
[ 9541.661953]  btrfs_create_chunk+0x967/0xf10
[ 9541.661963]  btrfs_chunk_alloc+0x174/0x500
[ 9541.661972]  flush_space+0x122/0x570
[ 9541.661979]  ? finish_task_switch.isra.0+0x99/0x2e0
[ 9541.661987]  ? __schedule+0x434/0x1660
[ 9541.661995]  btrfs_async_reclaim_data_space+0x67/0x190
[ 9541.662003]  process_one_work+0x179/0x330
[ 9541.662012]  worker_thread+0x252/0x390
[ 9541.662020]  ? __pfx_worker_thread+0x10/0x10
[ 9541.662026]  kthread+0xd2/0x100
[ 9541.662032]  ? __pfx_kthread+0x10/0x10
[ 9541.662038]  ret_from_fork+0x34/0x50
[ 9541.662045]  ? __pfx_kthread+0x10/0x10
[ 9541.662051]  ret_from_fork_asm+0x1a/0x30
[ 9541.662060]  </TASK>
[ 9541.662116] Mem-Info:
[ 9541.662123] active_anon:6624 inactive_anon:2278 isolated_anon:0
                active_file:273370 inactive_file:530010 isolated_file:0
                unevictable:4 dirty:78968 writeback:49794
                slab_reclaimable:34355 slab_unreclaimable:48270
                mapped:8068 shmem:25 pagetables:1952
                sec_pagetables:0 bounce:0
                kernel_misc_reclaimable:0
                free:47074 free_pcp:159 free_cma:0
[ 9541.662136] Node 0 active_anon:26496kB inactive_anon:9112kB active_file:1093480kB inactive_file:2120040kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:32272kB dirty:315872kB writeback:199176kB shmem:100kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:5408kB pagetables:7808kB sec_pagetables:0kB all_unreclaimable? no
[ 9541.662149] Node 0 DMA free:268kB boost:0kB min:268kB low:332kB high:396kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:14404kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 9541.662162] lowmem_reserve[]: 0 2785 3735 0 0
[ 9541.662174] Node 0 DMA32 free:168116kB boost:112948kB min:163152kB low:175700kB high:188248kB reserved_highatomic:2048KB active_anon:16340kB inactive_anon:4752kB active_file:394684kB inactive_file:2077180kB unevictable:0kB writepending:515216kB present:3011464kB managed:2895616kB mlocked:0kB bounce:0kB free_pcp:728kB local_pcp:0kB free_cma:0kB
[ 9541.662188] lowmem_reserve[]: 0 0 949 0 0
[ 9541.662199] Node 0 Normal free:19912kB boost:38488kB min:55596kB low:59872kB high:64148kB reserved_highatomic:2048KB active_anon:10156kB inactive_anon:4360kB active_file:698796kB inactive_file:28656kB unevictable:16kB writepending:64kB present:1048576kB managed:972096kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 9541.662212] lowmem_reserve[]: 0 0 0 0 0
[ 9541.662223] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 268kB
[ 9541.662257] Node 0 DMA32: 11853*4kB (UME) 7416*8kB (UME) 2597*16kB (UMEH) 585*32kB (UMEH) 24*64kB (UMEH) 2*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 170084kB
[ 9541.662325] Node 0 Normal: 130*4kB (U) 961*8kB (U) 460*16kB (UH) 84*32kB (UH) 2*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 19920kB
[ 9541.662368] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 9541.662373] 803382 total pagecache pages
[ 9541.662376] 46 pages in swap cache
[ 9541.662378] Free swap  = 3772412kB
[ 9541.662380] Total swap = 3883004kB
[ 9541.662383] 1019006 pages RAM
[ 9541.662385] 0 pages HighMem/MovableOnly
[ 9541.662387] 48238 pages reserved
[ 9541.662389] 0 pages cma reserved
[ 9541.662391] 0 pages hwpoisoned
[ 9541.662402] BTRFS error (device sdd): zoned: failed to read zone 879662989312 on /dev/sdd (devid 1)
[ 9541.662826] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9543.679052] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9545.875303] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9548.044731] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9550.326888] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9552.176217] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536

Unfortunately, I can't see anything special from the output, is there anything I can do to address the source of the issue?

adam900710 commented 1 week ago

This makes no sense, the printk shows the bufsize is 512, but vmalloc still got size 0.

morbidrsa commented 1 week ago

@adam900710 I'm not sure we're doing a __vmalloc(0) here:

 3662         if (area->nr_pages != nr_small_pages) {
 3663                 /*
 3664                  * vm_area_alloc_pages() can fail due to insufficient memory but
 3665                  * also:-
 3666                  *
 3667                  * - a pending fatal signal
 3668                  * - insufficient huge page-order pages
 3669                  *
 3670                  * Since we always retry allocations at order-0 in the huge page
 3671                  * case a warning for either is spurious
 3672                  */
 3673                 if (!fatal_signal_pending(current) && page_order == 0)
 3674                         warn_alloc(gfp_mask, NULL,
 3675                                 "vmalloc error: size %lu, failed to allocate pages",
 3676                                 area->nr_pages * PAGE_SIZE);
 3677                 goto fail;

So maybe vmalloc(512) is too small?

@karuboniru is it possible for you to test this patch:

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index ee2b74238758..e35f4554978d 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -188,8 +188,8 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
        bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);

        while (bufsize >= SECTOR_SIZE) {
-               buf = __vmalloc(bufsize,
-                               GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY);
+               buf = kvmalloc(bufsize,
+                              GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY);
                if (buf) {
                        *buflen = bufsize;
karuboniru commented 1 week ago

(edited, have to double check sth)

I just found I booted with unpatched kernel mistakenly (and this means the issue is not that reproducible...)

Anyway I will start testing with patch then.

morbidrsa commented 1 week ago

What’s the workload you’re using to trigger this?

karuboniru commented 1 week ago

Copying 1.7 TiB of fiiles to the volume (containing big sized files like disk images and small files like source codes), on a relatively cheap machine (J1900 with 4 GB of ram).

karuboniru commented 1 week ago

Another try (total > 3.5 TB of write) did not reproduce the issue with 6.11.5 (patched), should I do further test?

morbidrsa commented 1 week ago

If you can perform further tests, I'd be very happy.

karuboniru commented 1 week ago

Did another pass (formart, write 2T), no such error showed up.

morbidrsa commented 6 days ago

Thanks, I'll submit a formal patch to the scsi list today.

morbidrsa commented 6 days ago

Patch is submitted here: https://lore.kernel.org/linux-scsi/20241030110253.11718-1-jth@kernel.org/