liyi-ibm / linux

Linux kernel source tree
Other
0 stars 1 forks source link

cpu hard lockup caused by memory #12

Open liyi-ibm opened 5 years ago

liyi-ibm commented 5 years ago

On 4.14.49-3 kernel (without the memory control patch)

[Tue Dec 11 03:10:46 2018] Watchdog CPU:126 Hard LOCKUP
[Tue Dec 11 03:10:46 2018] Modules linked in: binfmt_misc i2c_dev ses enclosure scsi_transport_sas at24 ofpart powernv_flash opal_prd ipmi_powernv i2c_opal ipmi_devintf ipmi_msghandler mtd nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc xfs libcrc32c joydev ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm i40e i2c_core aacraid ptp pps_core
[Tue Dec 11 03:10:46 2018] CPU: 126 PID: 5607 Comm: java Not tainted 4.14.49-3.ppc64le #1
[Tue Dec 11 03:10:46 2018] task: c0000007af4cfc00 task.stack: c0000010f0964000
[Tue Dec 11 03:10:46 2018] NIP:  c000000000356e9c LR: c000000000356e5c CTR: 0000000000000000
[Tue Dec 11 03:10:46 2018] REGS: c00000003fa17d80 TRAP: 0900   Not tainted  (4.14.49-3.ppc64le)
[Tue Dec 11 03:10:46 2018] MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28888424  XER: 20040000
[Tue Dec 11 03:10:46 2018] CFAR: c0000000001d65d4 SOFTE: 0 
GPR00: c000000000356e5c c0000010f09673e0 c000000001413000 0000000000000001 
GPR04: c0002000d8733000 c0000010f0967450 0000000000000000 000000000005e2ab 
GPR08: 0000000000000000 c009dfe30de25698 0000201cc6380000 0000000000045d86 
GPR12: 0000000028242424 c00000000fd96a00 0000000000000000 c000001fe910fd70 
GPR16: 0000000000000003 0000000000000000 00000000014012c0 0000000000000000 
GPR20: c0000010f0967520 c000001fe910fd60 000000000000a382 c000000ebaea3800 
GPR24: 000000000000001c c0002000d87330c0 c0000010f0967450 c00020004c72f800 
GPR28: c0002002e215e000 c0002000d8733000 c000000ebaea3930 c0002003135b6000 
[Tue Dec 11 03:10:46 2018] NIP [c000000000356e9c] mem_cgroup_iter+0x24c/0x490
[Tue Dec 11 03:10:46 2018] LR [c000000000356e5c] mem_cgroup_iter+0x20c/0x490
[Tue Dec 11 03:10:46 2018] Call Trace:
[Tue Dec 11 03:10:46 2018] [c0000010f09673e0] [c000000000356e5c] mem_cgroup_iter+0x20c/0x490 (unreliable)
[Tue Dec 11 03:10:46 2018] [c0000010f0967430] [c00000000035c360] mem_cgroup_soft_limit_reclaim+0x160/0x480
[Tue Dec 11 03:10:46 2018] [c0000010f0967500] [c0000000002ae700] do_try_to_free_pages+0x1a0/0x4b0
[Tue Dec 11 03:10:46 2018] [c0000010f09675a0] [c0000000002aeb40] try_to_free_pages+0x130/0x2b0
[Tue Dec 11 03:10:46 2018] [c0000010f0967630] [c000000000294d54] __alloc_pages_nodemask+0x714/0x1080
[Tue Dec 11 03:10:46 2018] [c0000010f0967820] [c00000000031fbd4] alloc_pages_current+0xa4/0x130
[Tue Dec 11 03:10:46 2018] [c0000010f0967860] [c00000000032e168] new_slab+0x2d8/0x710
[Tue Dec 11 03:10:46 2018] [c0000010f0967930] [c0000000003312e4] ___slab_alloc+0x574/0x770
[Tue Dec 11 03:10:46 2018] [c0000010f0967a60] [c000000000336bb8] __slab_alloc+0x38/0x64
[Tue Dec 11 03:10:46 2018] [c0000010f0967a90] [c000000000331628] kmem_cache_alloc+0x148/0x300
[Tue Dec 11 03:10:46 2018] [c0000010f0967af0] [c000000000375a74] get_empty_filp+0x84/0x210
[Tue Dec 11 03:10:46 2018] [c0000010f0967b70] [c000000000388928] path_openat+0x88/0x12f0
[Tue Dec 11 03:10:46 2018] [c0000010f0967c90] [c00000000038b310] do_filp_open+0x80/0x100
[Tue Dec 11 03:10:46 2018] [c0000010f0967dc0] [c00000000036fd40] do_sys_open+0x1c0/0x2e0
[Tue Dec 11 03:10:46 2018] [c0000010f0967e30] [c00000000000b9e0] system_call+0x58/0x6c
[Tue Dec 11 03:10:46 2018] Instruction dump:
[Tue Dec 11 03:10:46 2018] 418a0104 419e0050 8123006c 792a07e1 40820044 e9230018 792a07a1 40c201a0 
[Tue Dec 11 03:10:46 2018] 39400000 886d028a 994d028a e94d0030 <7d09502a> 39080001 7d09512a 4bcbf181 
[Tue Dec 11 03:10:46 2018] Watchdog CPU:126 became unstuck
[Tue Dec 11 03:11:00 2018] Watchdog CPU:122 Hard LOCKUP
[Tue Dec 11 03:11:00 2018] Modules linked in: binfmt_misc i2c_dev ses enclosure scsi_transport_sas at24 ofpart powernv_flash opal_prd ipmi_powernv i2c_opal ipmi_devintf ipmi_msghandler mtd nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc xfs libcrc32c joydev ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm i40e i2c_core aacraid ptp pps_core
[Tue Dec 11 03:11:00 2018] CPU: 122 PID: 620 Comm: migration/122 Not tainted 4.14.49-3.ppc64le #1
[Tue Dec 11 03:11:00 2018] task: c000201cb5c1a400 task.stack: c000201cb5d9c000
[Tue Dec 11 03:11:00 2018] NIP:  c00000000014778c LR: c000000000024fec CTR: c000000000152060
[Tue Dec 11 03:11:00 2018] REGS: c00000003fa47d80 TRAP: 0900   Not tainted  (4.14.49-3.ppc64le)
[Tue Dec 11 03:11:00 2018] MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24000024  XER: 00000000
[Tue Dec 11 03:11:00 2018] CFAR: c000000000024fe8 SOFTE: 0 
GPR00: c000000000024fec c000201cb5d9fb20 c000000001413000 c000000bfc1d6500 
GPR04: 000000000097dee8 0000000000000002 c00000000fc5e618 0000000000000000 
GPR08: 0000000000000000 00000000004dc355 000000000097dee8 c000201cc7245910 
GPR12: 0000000000000000 c00000000fd93e00 c000000000130968 c000001ff5318380 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 c000000000b2599c 0000201cc62c0000 
GPR24: c000000000b25210 0000000000000000 c000201cb5c1a9f8 0000201cc62c0000 
GPR28: c000000bfc1d6500 c000201cc7245880 c000000bfc1d6500 c000000bfc1d6500 
[Tue Dec 11 03:11:00 2018] NIP [c00000000014778c] account_system_index_time+0x2c/0xd0
[Tue Dec 11 03:11:00 2018] LR [c000000000024fec] vtime_flush+0x18c/0x220
[Tue Dec 11 03:11:00 2018] Call Trace:
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fb20] [c000000bfc1d6500] 0xc000000bfc1d6500 (unreliable)
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fb50] [c000000000024fec] vtime_flush+0x18c/0x220
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fb80] [c000000000147bd4] vtime_common_task_switch+0x44/0x90
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fbb0] [c00000000013cf14] finish_task_switch+0x84/0x330
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fc60] [c000000000b25210] __schedule+0x2b0/0x9f0
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fd30] [c000000000b2599c] schedule+0x4c/0xc0
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fd60] [c000000000137024] smpboot_thread_fn+0x214/0x2b0
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fdc0] [c000000000130acc] kthread+0x16c/0x1b0
[Tue Dec 11 03:11:00 2018] [c000201cb5d9fe30] [c00000000000bdd0] ret_from_kernel_thread+0x5c/0x8c
[Tue Dec 11 03:11:00 2018] Instruction dump:
[Tue Dec 11 03:11:00 2018] 60420000 3c4c012d 3842b8a0 7c0802a6 f8010010 60000000 7c0802a6 fbe1fff8 
[Tue Dec 11 03:11:00 2018] f8010010 f821ffd1 7c7f1b78 7c8a2378 <e90305d8> e92306d0 7d082214 f90305d8 
[Tue Dec 11 03:11:00 2018] Watchdog CPU:122 became unstuck
[Tue Dec 11 03:11:54 2018] java: page allocation stalls for 11620ms, order:0, mode:0x1c200ca(GFP_HIGHUSER_MOVABLE|__GFP_WRITE), nodemask=(null)
[Tue Dec 11 03:11:54 2018] java cpuset=/ mems_allowed=0,8
[Tue Dec 11 03:11:54 2018] CPU: 60 PID: 9205 Comm: java Not tainted 4.14.49-3.ppc64le #1
[Tue Dec 11 03:11:54 2018] Call Trace:
[Tue Dec 11 03:11:54 2018] [c000000061cdf6f0] [c000000000b0b8bc] dump_stack+0xb0/0xf4 (unreliable)
[Tue Dec 11 03:11:54 2018] [c000000061cdf730] [c0000000002944e8] warn_alloc+0x128/0x1c0
[Tue Dec 11 03:11:54 2018] [c000000061cdf7e0] [c000000000295024] __alloc_pages_nodemask+0x9e4/0x1080
[Tue Dec 11 03:11:54 2018] [c000000061cdf9d0] [c00000000031fbd4] alloc_pages_current+0xa4/0x130
[Tue Dec 11 03:11:54 2018] [c000000061cdfa10] [c000000000282f38] __page_cache_alloc+0xf8/0x170
[Tue Dec 11 03:11:54 2018] [c000000061cdfa40] [c0000000002844e4] pagecache_get_page+0xc4/0x3d0
[Tue Dec 11 03:11:54 2018] [c000000061cdfaa0] [c000000000285f8c] grab_cache_page_write_begin+0x3c/0x70
[Tue Dec 11 03:11:54 2018] [c000000061cdfad0] [c000000000467a38] ext4_da_write_begin+0x118/0x550
[Tue Dec 11 03:11:54 2018] [c000000061cdfb90] [c000000000286208] generic_perform_write+0xf8/0x260
[Tue Dec 11 03:11:54 2018] [c000000061cdfc20] [c000000000289660] __generic_file_write_iter+0x200/0x240
[Tue Dec 11 03:11:54 2018] [c000000061cdfc80] [c00000000044fb5c] ext4_file_write_iter+0x18c/0x4a0
[Tue Dec 11 03:11:54 2018] [c000000061cdfd00] [c000000000373a50] __vfs_write+0x130/0x1f0
[Tue Dec 11 03:11:54 2018] [c000000061cdfd90] [c000000000373d44] vfs_write+0xd4/0x250
[Tue Dec 11 03:11:54 2018] [c000000061cdfde0] [c00000000037409c] SyS_write+0x6c/0x110
[Tue Dec 11 03:11:54 2018] [c000000061cdfe30] [c00000000000b9e0] system_call+0x58/0x6c
[Tue Dec 11 03:11:54 2018] Mem-Info:
[Tue Dec 11 03:11:54 2018] active_anon:3254359 inactive_anon:109281 isolated_anon:0
 active_file:4319 inactive_file:383301 isolated_file:160
 unevictable:0 dirty:28778 writeback:0 unstable:0
 slab_reclaimable:27749 slab_unreclaimable:59901
 mapped:1496 shmem:248 pagetables:8319 bounce:0
 free:5901 free_pcp:215 free_cma:0
[Tue Dec 11 03:11:54 2018] Node 0 active_anon:97722688kB inactive_anon:3471936kB active_file:230592kB inactive_file:10390592kB unevictable:0kB isolated(anon):0kB isolated(file):2048kB mapped:84992kB dirty:798080kB writeback:0kB shmem:14144kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 12152832kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Tue Dec 11 03:11:54 2018] Node 8 active_anon:110556288kB inactive_anon:3522048kB active_file:45824kB inactive_file:14132480kB unevictable:0kB isolated(anon):0kB isolated(file):8192kB mapped:10752kB dirty:1043712kB writeback:0kB shmem:1728kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 34891776kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Tue Dec 11 03:11:54 2018] Node 0 DMA free:185664kB min:180096kB low:313920kB high:447744kB active_anon:97722688kB inactive_anon:3471936kB active_file:229696kB inactive_file:10374208kB unevictable:0kB writepending:813312kB present:134217728kB managed:133882304kB mlocked:0kB kernel_stack:168016kB pagetables:276416kB bounce:0kB free_pcp:7296kB local_pcp:320kB free_cma:0kB
[Tue Dec 11 03:11:54 2018] lowmem_reserve[]: 0 0 0 0
[Tue Dec 11 03:11:54 2018] Node 8 DMA free:199680kB min:180288kB low:314176kB high:448064kB active_anon:110556672kB inactive_anon:3522048kB active_file:45824kB inactive_file:14140160kB unevictable:0kB writepending:1069824kB present:134217728kB managed:133946496kB mlocked:0kB kernel_stack:44832kB pagetables:256000kB bounce:0kB free_pcp:6464kB local_pcp:320kB free_cma:0kB
[Tue Dec 11 03:11:54 2018] lowmem_reserve[]: 0 0 0 0
[Tue Dec 11 03:11:54 2018] Node 0 DMA: 1819*64kB (UME) 541*128kB (UE) 14*256kB (U) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 189760kB
[Tue Dec 11 03:11:54 2018] Node 8 DMA: 1329*64kB (UMEC) 760*128kB (UE) 62*256kB (UE) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 198720kB
[Tue Dec 11 03:11:54 2018] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Dec 11 03:11:54 2018] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Dec 11 03:11:54 2018] Node 8 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Dec 11 03:11:54 2018] Node 8 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Dec 11 03:11:54 2018] 388684 total pagecache pages
[Tue Dec 11 03:11:54 2018] 960 pages in swap cache
[Tue Dec 11 03:11:54 2018] Swap cache stats: add 876807, delete 875769, find 456053/587622
[Tue Dec 11 03:11:54 2018] Free swap  = 3020608kB
[Tue Dec 11 03:11:54 2018] Total swap = 4194240kB
[Tue Dec 11 03:11:54 2018] 4194304 pages RAM
[Tue Dec 11 03:11:54 2018] 0 pages HighMem/MovableOnly
[Tue Dec 11 03:11:54 2018] 9479 pages reserved
[Tue Dec 11 03:11:54 2018] 209920 pages cma reserved
[Tue Dec 11 03:11:54 2018] 0 pages hwpoisoned
liyi-ibm commented 5 years ago

[root@datanode11 /cgroup/cpu/hadoop-yarn]# cat cpu.cfs_period_us 100000 [root@datanode11 /cgroup/cpu/hadoop-yarn]# cat cpu.cfs_quota_us 17600000 [root@datanode11 /cgroup/cpu/hadoop-yarn]# Message from syslogd@datanode11 at Dec 11 03:10:35 ... kernel:Watchdog CPU:126 Hard LOCKUP

Message from syslogd@datanode11 at Dec 11 03:10:35 ... kernel:Watchdog CPU:126 became unstuck

Message from syslogd@datanode11 at Dec 11 03:10:49 ... kernel:Watchdog CPU:122 Hard LOCKUP

Message from syslogd@datanode11 at Dec 11 03:10:49 ... kernel:Watchdog CPU:122 became unstuck

liyi-ibm commented 5 years ago

The dmesg is memory related lockup. You need patched kernel. CPU related lockup will have unthrottle_cfs_rq and other cfs functions in backtrace. Memory related lockup will have alloc_pages and mem_cgroup in the backtrace. They are different scenarios. You certainly need to mitigate the memory cgroup lockup with the patched kernel before running heavy workload and then use my cpu.cfs_xx settings to avoid the cpu scheduler (cfs) related lockup.