Open karuboniru opened 1 week ago
I tried to atttach the bpf program from the previous issue, and re-fromat the disk, and re-write the data. The error showed again:
[23876.672196] kworker/u18:2: vmalloc error: size 0, failed to allocate pages, mode:0x10dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[23876.672228] CPU: 2 UID: 0 PID: 156109 Comm: kworker/u18:2 Not tainted 6.11.3-200.fc40.x86_64 #1
[23876.672236] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 12/20/2018
[23876.672240] Workqueue: events_unbound btrfs_async_reclaim_data_space
[23876.672252] Call Trace:
[23876.672257] <TASK>
[23876.672264] dump_stack_lvl+0x5d/0x80
[23876.672274] warn_alloc+0x164/0x190
[23876.672281] ? alloc_pages_mpol_noprof+0xd7/0x1e0
[23876.672289] __vmalloc_node_range_noprof+0x7d6/0x870
[23876.672296] ? sd_zbc_report_zones+0xde/0x5a0
[23876.672304] __vmalloc_node_noprof+0x4e/0x70
[23876.672309] ? sd_zbc_report_zones+0xde/0x5a0
[23876.672314] sd_zbc_report_zones+0xde/0x5a0
[23876.672319] ? __pfx_copy_zone_info_cb+0x10/0x10
[23876.672325] ? __pfx_sd_zbc_report_zones+0x10/0x10
[23876.672330] ? __pfx_copy_zone_info_cb+0x10/0x10
[23876.672335] btrfs_get_dev_zones+0xc1/0x1d0
[23876.672340] btrfs_load_block_group_zone_info+0x2f1/0xdd0
[23876.672346] ? kmem_cache_alloc_noprof+0x12d/0x2f0
[23876.672353] ? __kmalloc_cache_noprof+0x138/0x310
[23876.672360] btrfs_make_block_group+0xda/0x2b0
[23876.672366] btrfs_create_chunk+0x967/0xf10
[23876.672375] btrfs_chunk_alloc+0x174/0x500
[23876.672383] flush_space+0x122/0x570
[23876.672388] ? finish_task_switch.isra.0+0x99/0x2e0
[23876.672395] ? __schedule+0x434/0x1660
[23876.672403] btrfs_async_reclaim_data_space+0x67/0x190
[23876.672409] process_one_work+0x179/0x330
[23876.672416] worker_thread+0x252/0x390
[23876.672423] ? __pfx_worker_thread+0x10/0x10
[23876.672428] kthread+0xd2/0x100
[23876.672433] ? __pfx_kthread+0x10/0x10
[23876.672437] ret_from_fork+0x34/0x50
[23876.672447] ? __pfx_kthread+0x10/0x10
[23876.672451] ret_from_fork_asm+0x1a/0x30
[23876.672461] </TASK>
[23876.672566] Mem-Info:
[23876.672571] active_anon:27362 inactive_anon:35201 isolated_anon:0
active_file:145472 inactive_file:526171 isolated_file:0
unevictable:4 dirty:67450 writeback:8270
slab_reclaimable:102501 slab_unreclaimable:54021
mapped:27306 shmem:11487 pagetables:2463
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:49613 free_pcp:92 free_cma:0
[23876.672582] Node 0 active_anon:109448kB inactive_anon:140804kB active_file:581888kB inactive_file:2104684kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:109224kB dirty:269800kB writeback:32932kB shmem:45948kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:5888kB pagetables:9852kB sec_pagetables:0kB all_unreclaimable? no
[23876.672592] Node 0 DMA free:268kB boost:0kB min:268kB low:332kB high:396kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:14576kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[23876.672604] lowmem_reserve[]: 0 2842 3735 0 0
[23876.672613] Node 0 DMA32 free:172492kB boost:115236kB min:166452kB low:179256kB high:192060kB reserved_highatomic:2048KB active_anon:54924kB inactive_anon:76820kB active_file:300820kB inactive_file:1916564kB unevictable:0kB writepending:298496kB present:3011464kB managed:2945824kB mlocked:0kB bounce:0kB free_pcp:608kB local_pcp:0kB free_cma:0kB
[23876.672624] lowmem_reserve[]: 0 0 893 0 0
[23876.672633] Node 0 Normal free:25692kB boost:36216kB min:52312kB low:56336kB high:60360kB reserved_highatomic:2048KB active_anon:54524kB inactive_anon:63984kB active_file:280892kB inactive_file:174184kB unevictable:16kB writepending:4516kB present:1048576kB managed:921916kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[23876.672644] lowmem_reserve[]: 0 0 0 0 0
[23876.672652] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 268kB
[23876.672680] Node 0 DMA32: 9153*4kB (UME) 5967*8kB (UME) 2944*16kB (UME) 1025*32kB (UMEH) 127*64kB (UMEH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 172380kB
[23876.672709] Node 0 Normal: 193*4kB (UME) 348*8kB (UMEH) 341*16kB (UMEH) 518*32kB (UMEH) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25716kB
[23876.672740] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[23876.672744] 683845 total pagecache pages
[23876.672746] 613 pages in swap cache
[23876.672748] Free swap = 3759612kB
[23876.672750] Total swap = 3883004kB
[23876.672752] 1019006 pages RAM
[23876.672754] 0 pages HighMem/MovableOnly
[23876.672755] 48231 pages reserved
[23876.672757] 0 pages cma reserved
[23876.672759] 0 pages hwpoisoned
[23876.672771] BTRFS error (device sdd): zoned: failed to read zone 228170137600 on /dev/sdd (devid 1)
I am getting nothing special other from the trace: trace_pipe.txt, which is always lines like
worker/u18:1-97325 [003] ...21 24267.794484: bpf_trace_printk: nr_zones=1, max_hw_sectors=33550336, max_segments=688128
I tried to patch the kernel on the system following the comment here, first test with 6.12.0-rc4
did not reproduce the issue (unsure the reason, fixed or just coincidence?) The log attached is with the second try with 6.11.5
. The place where I see the error is:
[ 9531.139017] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9533.057222] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9535.108599] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9537.143737] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9539.453483] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9541.661255] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9541.661764] kworker/u18:2: vmalloc error: size 0, failed to allocate pages, mode:0x10dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NORETRY|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[ 9541.661789] CPU: 3 UID: 0 PID: 1915 Comm: kworker/u18:2 Not tainted 6.11.5-200.fc40.x86_64 #1
[ 9541.661797] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 12/20/2018
[ 9541.661802] Workqueue: events_unbound btrfs_async_reclaim_data_space
[ 9541.661815] Call Trace:
[ 9541.661821] <TASK>
[ 9541.661828] dump_stack_lvl+0x5d/0x80
[ 9541.661840] warn_alloc+0x164/0x190
[ 9541.661849] ? alloc_pages_mpol_noprof+0xd7/0x1e0
[ 9541.661858] __vmalloc_node_range_noprof+0x7d6/0x870
[ 9541.661866] ? sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661875] ? _printk+0x6c/0x87
[ 9541.661882] __vmalloc_node_noprof+0x4e/0x70
[ 9541.661888] ? sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661894] sd_zbc_report_zones.cold+0x73/0x5c3
[ 9541.661901] ? __pfx_copy_zone_info_cb+0x10/0x10
[ 9541.661908] ? __slab_free+0xdf/0x2e0
[ 9541.661915] btrfs_get_dev_zones+0xc1/0x1d0
[ 9541.661922] btrfs_load_block_group_zone_info+0x2f1/0xe00
[ 9541.661929] ? kmem_cache_alloc_noprof+0x12d/0x2f0
[ 9541.661937] ? __kmalloc_cache_noprof+0x138/0x310
[ 9541.661945] btrfs_make_block_group+0xda/0x2b0
[ 9541.661953] btrfs_create_chunk+0x967/0xf10
[ 9541.661963] btrfs_chunk_alloc+0x174/0x500
[ 9541.661972] flush_space+0x122/0x570
[ 9541.661979] ? finish_task_switch.isra.0+0x99/0x2e0
[ 9541.661987] ? __schedule+0x434/0x1660
[ 9541.661995] btrfs_async_reclaim_data_space+0x67/0x190
[ 9541.662003] process_one_work+0x179/0x330
[ 9541.662012] worker_thread+0x252/0x390
[ 9541.662020] ? __pfx_worker_thread+0x10/0x10
[ 9541.662026] kthread+0xd2/0x100
[ 9541.662032] ? __pfx_kthread+0x10/0x10
[ 9541.662038] ret_from_fork+0x34/0x50
[ 9541.662045] ? __pfx_kthread+0x10/0x10
[ 9541.662051] ret_from_fork_asm+0x1a/0x30
[ 9541.662060] </TASK>
[ 9541.662116] Mem-Info:
[ 9541.662123] active_anon:6624 inactive_anon:2278 isolated_anon:0
active_file:273370 inactive_file:530010 isolated_file:0
unevictable:4 dirty:78968 writeback:49794
slab_reclaimable:34355 slab_unreclaimable:48270
mapped:8068 shmem:25 pagetables:1952
sec_pagetables:0 bounce:0
kernel_misc_reclaimable:0
free:47074 free_pcp:159 free_cma:0
[ 9541.662136] Node 0 active_anon:26496kB inactive_anon:9112kB active_file:1093480kB inactive_file:2120040kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:32272kB dirty:315872kB writeback:199176kB shmem:100kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:5408kB pagetables:7808kB sec_pagetables:0kB all_unreclaimable? no
[ 9541.662149] Node 0 DMA free:268kB boost:0kB min:268kB low:332kB high:396kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:14404kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 9541.662162] lowmem_reserve[]: 0 2785 3735 0 0
[ 9541.662174] Node 0 DMA32 free:168116kB boost:112948kB min:163152kB low:175700kB high:188248kB reserved_highatomic:2048KB active_anon:16340kB inactive_anon:4752kB active_file:394684kB inactive_file:2077180kB unevictable:0kB writepending:515216kB present:3011464kB managed:2895616kB mlocked:0kB bounce:0kB free_pcp:728kB local_pcp:0kB free_cma:0kB
[ 9541.662188] lowmem_reserve[]: 0 0 949 0 0
[ 9541.662199] Node 0 Normal free:19912kB boost:38488kB min:55596kB low:59872kB high:64148kB reserved_highatomic:2048KB active_anon:10156kB inactive_anon:4360kB active_file:698796kB inactive_file:28656kB unevictable:16kB writepending:64kB present:1048576kB managed:972096kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 9541.662212] lowmem_reserve[]: 0 0 0 0 0
[ 9541.662223] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 8*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 268kB
[ 9541.662257] Node 0 DMA32: 11853*4kB (UME) 7416*8kB (UME) 2597*16kB (UMEH) 585*32kB (UMEH) 24*64kB (UMEH) 2*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 170084kB
[ 9541.662325] Node 0 Normal: 130*4kB (U) 961*8kB (U) 460*16kB (UH) 84*32kB (UH) 2*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 1*1024kB (H) 0*2048kB 0*4096kB = 19920kB
[ 9541.662368] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 9541.662373] 803382 total pagecache pages
[ 9541.662376] 46 pages in swap cache
[ 9541.662378] Free swap = 3772412kB
[ 9541.662380] Total swap = 3883004kB
[ 9541.662383] 1019006 pages RAM
[ 9541.662385] 0 pages HighMem/MovableOnly
[ 9541.662387] 48238 pages reserved
[ 9541.662389] 0 pages cma reserved
[ 9541.662391] 0 pages hwpoisoned
[ 9541.662402] BTRFS error (device sdd): zoned: failed to read zone 879662989312 on /dev/sdd (devid 1)
[ 9541.662826] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9543.679052] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9545.875303] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9548.044731] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9550.326888] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
[ 9552.176217] nr_zones=1, sdkp->zone_info.nr_zones=74508, bufsize=512, sdkp->capacity=4882956288, sdkp->zone_info.zone_blocks=65536
Unfortunately, I can't see anything special from the output, is there anything I can do to address the source of the issue?
This makes no sense, the printk shows the bufsize is 512, but vmalloc still got size 0.
@adam900710 I'm not sure we're doing a __vmalloc(0) here:
3662 if (area->nr_pages != nr_small_pages) {
3663 /*
3664 * vm_area_alloc_pages() can fail due to insufficient memory but
3665 * also:-
3666 *
3667 * - a pending fatal signal
3668 * - insufficient huge page-order pages
3669 *
3670 * Since we always retry allocations at order-0 in the huge page
3671 * case a warning for either is spurious
3672 */
3673 if (!fatal_signal_pending(current) && page_order == 0)
3674 warn_alloc(gfp_mask, NULL,
3675 "vmalloc error: size %lu, failed to allocate pages",
3676 area->nr_pages * PAGE_SIZE);
3677 goto fail;
So maybe vmalloc(512) is too small?
@karuboniru is it possible for you to test this patch:
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index ee2b74238758..e35f4554978d 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -188,8 +188,8 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
while (bufsize >= SECTOR_SIZE) {
- buf = __vmalloc(bufsize,
- GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY);
+ buf = kvmalloc(bufsize,
+ GFP_KERNEL | __GFP_ZERO | __GFP_NORETRY);
if (buf) {
*buflen = bufsize;
(edited, have to double check sth)
I just found I booted with unpatched kernel mistakenly (and this means the issue is not that reproducible...)
Anyway I will start testing with patch then.
What’s the workload you’re using to trigger this?
Copying 1.7 TiB of fiiles to the volume (containing big sized files like disk images and small files like source codes), on a relatively cheap machine (J1900 with 4 GB of ram).
Another try (total > 3.5 TB of write) did not reproduce the issue with 6.11.5 (patched), should I do further test?
If you can perform further tests, I'd be very happy.
Did another pass (formart, write 2T), no such error showed up.
Thanks, I'll submit a formal patch to the scsi list today.
Patch is submitted here: https://lore.kernel.org/linux-scsi/20241030110253.11718-1-jth@kernel.org/
Dmesg: https://gist.github.com/karuboniru/b9cf4c2d5ba290e3c4f7bd7091ca67fa
Happens on
6.11.3-200.fc40.x86_64
, when writing ~2TB of data to a newly formatted HC650 disk. No further error is detected after seeing the error. The write seems to have succeed.Might be related to https://github.com/kdave/btrfs-progs/issues/779 but the volume did not go RO afterwards
Still trying to find a way to reproduce the issue.