axboe / fio

Flexible I/O Tester
GNU General Public License v2.0
5.26k stars 1.26k forks source link

fio hangs in case of slab fails to allocate contagious memory allocation (xfs failed to allocate memory due to memory fragmentation) #254

Closed skmuttlu closed 8 years ago

skmuttlu commented 8 years ago

Hi,

Let me start with little bit background, however just a brief. Intermittently, however more frequently some of the random write test cases with block size split were failing due to fio hung issue. I have been observing this since last few days, but couldn’t repro it frequently. FIO hung issue was bit annoying to me where test would fail due to fio hung state.

why fio was hung (in deadlock state)? FIO would look for contiguous memory allocation, however I don’t see related sys log messages to say that what size of pages have been requested. This could be the potential reason being extreme memory fragmentation and/or XFS memory pressure issue led FIO into deadlock state. Some specific test cases (FIO workloads) induces extremely fragmented filesystem by involving tailored FIO workloads, and read/write on the same would induce high memory pressure as well. One of these reason might have cause memory extreme fragmentation and failed SLAB memory allocation.

Link to xfs issue: https://bugzilla.kernel.org/show_bug.cgi?id=73831

My tests would run the test with defined runtime (runtime=1200) and kill if fio job(s) exceeds 1200s runtime. I have been observing this issue very often since couple of months on xfs filesystem, where FIO was not honoring defined runtime b’css of hung/deadlock state ( due SLAB memory allocation deadlock), and/or sometime killing (pkill -9) dishonored FIO procs in this state resulting defunct process (/corrupted state).

Here is the substantial proof for the same, where memory allocation was hung...

[Mon Oct 10 21:08:53 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:54 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:56 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:56 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) ATF sends ‘pkill -9’ to dishonored FIO procs, in this case deadlock procs which led proc into defunct state… r3a4-28:~ # ps -aef | grep 9163 root 9163 1 0 15:57 ? 00:00:05 /usr/local/staf/bin/STAFProc root 18111 9163 47 20:48 ? 00:21:17 [fio] root 18125 9163 47 20:48 ? 00:21:12 [fio] root 18171 9163 47 20:48 ? 00:21:15 [fio] root 18186 9163 47 20:48 ? 00:21:11 [fio] FIO process traces. 2016-10-10T20:14:39.134180-07:00 r3a4-28 kernel: [15561.432108] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.134202-07:00 r3a4-28 kernel: [15561.432114] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.134203-07:00 r3a4-28 kernel: [15561.432115] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.134203-07:00 r3a4-28 kernel: [15561.432117] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.134204-07:00 r3a4-28 kernel: [15561.432121] ffffffff8107793e 0000000000000000 ffff88047ff99b00 000000027ff98000 2016-10-10T20:14:39.134205-07:00 r3a4-28 kernel: [15561.432123] ffff88047ffbdf68 0000000000000046 0000000000000000 0000000000000000 2016-10-10T20:14:39.164051-07:00 r3a4-28 kernel: [15561.432126] Call Trace: 2016-10-10T20:14:39.164066-07:00 r3a4-28 kernel: [15561.432139] [] dump_trace+0x7d/0x2d0 2016-10-10T20:14:39.164068-07:00 r3a4-28 kernel: [15561.432143] [] show_stack_log_lvl+0x94/0x170 2016-10-10T20:14:39.164069-07:00 r3a4-28 kernel: [15561.432147] [] show_stack+0x21/0x50 2016-10-10T20:14:39.164070-07:00 r3a4-28 kernel: [15561.432152] [] dump_stack+0x41/0x51 2016-10-10T20:14:39.164071-07:00 r3a4-28 kernel: [15561.432158] [] warn_alloc_failed+0xf0/0x160 2016-10-10T20:14:39.164072-07:00 r3a4-28 kernel: [15561.432163] [] __alloc_pages_slowpath+0x72f/0x796 2016-10-10T20:14:39.164197-07:00 r3a4-28 kernel: [15561.432166] [] __alloc_pages_nodemask+0x1ea/0x210 2016-10-10T20:14:39.164202-07:00 r3a4-28 kernel: [15561.432172] [] kmem_getpages+0x58/0x1a0 2016-10-10T20:14:39.164203-07:00 r3a4-28 kernel: [15561.432176] [] fallback_alloc+0x20f/0x250 2016-10-10T20:14:39.164204-07:00 r3a4-28 kernel: [15561.432180] [] kmem_cache_alloc+0x11c/0x440 2016-10-10T20:14:39.164205-07:00 r3a4-28 kernel: [15561.432185] [] sk_prot_alloc+0x34/0x180 2016-10-10T20:14:39.164206-07:00 r3a4-28 kernel: [15561.432189] [] sk_clone_lock+0x1b/0x2e0 2016-10-10T20:14:39.164207-07:00 r3a4-28 kernel: [15561.432194] [] inet_csk_clone_lock+0x11/0x140 2016-10-10T20:14:39.164344-07:00 r3a4-28 kernel: [15561.432199] [] tcp_create_openreq_child+0x1e/0x430 2016-10-10T20:14:39.164348-07:00 r3a4-28 kernel: [15561.432202] [] tcp_v4_syn_recv_sock+0x38/0x250 2016-10-10T20:14:39.164350-07:00 r3a4-28 kernel: [15561.432206] [] tcp_check_req+0x3b9/0x530 2016-10-10T20:14:39.164351-07:00 r3a4-28 kernel: [15561.432212] [] tcp_v4_do_rcv+0x1dc/0x300 2016-10-10T20:14:39.164352-07:00 r3a4-28 kernel: [15561.432217] [] tcp_v4_rcv+0x75f/0x780 2016-10-10T20:14:39.164353-07:00 r3a4-28 kernel: [15561.432221] [] ip_local_deliver_finish+0x9e/0x200 2016-10-10T20:14:39.164356-07:00 r3a4-28 kernel: [15561.432226] [] __netif_receive_skb_core+0x627/0x800 2016-10-10T20:14:39.164357-07:00 r3a4-28 kernel: [15561.432230] [] netif_receive_skb+0x1f/0x90 2016-10-10T20:14:39.164358-07:00 r3a4-28 kernel: [15561.432234] [] napi_gro_receive+0x68/0x90 2016-10-10T20:14:39.164359-07:00 r3a4-28 kernel: [15561.432244] [] igb_clean_rx_irq+0x3be/0x8a0 [igb] 2016-10-10T20:14:39.164361-07:00 r3a4-28 kernel: [15561.432261] [] igb_poll+0x2e2/0x760 [igb] 2016-10-10T20:14:39.164361-07:00 r3a4-28 kernel: [15561.432270] [] net_rx_action+0x140/0x240 2016-10-10T20:14:39.164362-07:00 r3a4-28 kernel: [15561.432275] [] __do_softirq+0xe5/0x230 2016-10-10T20:14:39.164495-07:00 r3a4-28 kernel: [15561.432280] [] call_softirq+0x1c/0x30 2016-10-10T20:14:39.164500-07:00 r3a4-28 kernel: [15561.432283] [] do_softirq+0x55/0x90 2016-10-10T20:14:39.164501-07:00 r3a4-28 kernel: [15561.432286] [] irq_exit+0x95/0xa0 2016-10-10T20:14:39.164502-07:00 r3a4-28 kernel: [15561.432290] [] do_IRQ+0x4e/0xb0 2016-10-10T20:14:39.164503-07:00 r3a4-28 kernel: [15561.432295] [] common_interrupt+0x6d/0x6d 2016-10-10T20:14:39.164504-07:00 r3a4-28 kernel: [15561.432308] [<0000000000431b38>] 0x431b37 2016-10-10T20:14:39.164505-07:00 r3a4-28 kernel: [15561.432309] SLAB: Unable to allocate memory on node 0 (gfp=0x20) 2016-10-10T20:14:39.164509-07:00 r3a4-28 kernel: [15561.432310] cache: TCP, object size: 1728, order: 1 2016-10-10T20:14:39.164510-07:00 r3a4-28 kernel: [15561.432313] node 0: slabs: 5/5, objs: 20/20, free: 0 2016-10-10T20:14:39.164511-07:00 r3a4-28 kernel: [15561.432624] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.164512-07:00 r3a4-28 kernel: [15561.432626] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.164513-07:00 r3a4-28 kernel: [15561.432627] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.164514-07:00 r3a4-28 kernel: [15561.432628] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.164505-07:00 r3a4-28 kernel: [15561.432309] SLAB: Unable to allocate memory on node 0 (gfp=0x20) 2016-10-10T20:14:39.164509-07:00 r3a4-28 kernel: [15561.432310] cache: TCP, object size: 1728, order: 1 2016-10-10T20:14:39.164510-07:00 r3a4-28 kernel: [15561.432313] node 0: slabs: 5/5, objs: 20/20, free: 0 2016-10-10T20:14:39.164511-07:00 r3a4-28 kernel: [15561.432624] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.164512-07:00 r3a4-28 kernel: [15561.432626] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.164513-07:00 r3a4-28 kernel: [15561.432627] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.164514-07:00 r3a4-28 kernel: [15561.432628] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.164669-07:00 r3a4-28 kernel: [15561.432631] 0000000000000000 0000000000000000 ffff88047ff99b00 0000000200000000 2016-10-10T20:14:39.164673-07:00 r3a4-28 kernel: [15561.432633] 0000000000000002 0000000000000000 00ff880400000000 0000000000000000 I was thinking that following procedure would resolve the issue, however it did not help to prevent the issue: ``` I observed this with more than 60% free memory as well(+ buffer/cache) ``` 1. Clearing aches: Drop the caches "echo 3 > /proc/sys/vm/drop_caches" or “echo 1 > /proc/sys/vm/drop_caches”, executing this for every 2 minutes 1. ”echo 1 > /proc/sys/vm/compact_memory", (with CONFIG_COMPACTION=y) 2. Reduce huge page size to 128K Aforementioned procedure did not help to mitigate/prevent the issue Here is Before reboot (even after compact) r3a4-28:~ # cat /proc/buddyinfo │·· Node 0, zone DMA 1 0 0 1 0 1 1 0 1 1 3 │·· Node 0, zone DMA32 26713 23984 16125 7850 2725 916 526 244 93 78 4 │·· Node 0, zone Normal 326417 256905 167300 76595 21201 4638 1814 778 207 371 3 after reboot: r3a4-28:~ # cat /proc/buddyinfo │·· Node 0, zone DMA 1 0 0 1 0 1 1 0 1 1 3 │·· Node 0, zone DMA32 197 117 39 9 1 2 2 4 2 2 426 │·· Node 0, zone Normal 30 487 133 51 8 3 12 11 6 4 3347 Questions: Just curious to know is there anyway I can prevent this issue from fio side by dynamically switching contiguous mem request/allocation (kind of fall back option)?
axboe commented 8 years ago

This isn't a fio issue (it's a kernel bug), nor is it fio asking for a memory allocation that fails. There's nothing that fio can or should do to prevent this, it's something that should be fixed on the kernel side.