Open masamichitakagi opened 3 years ago
Takagi-san,
This is 張.
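I am providing the complete job set (QJ210121-006.tar.gz).
○ Before running
1. Extracting the tar
Extract the above tar.gz file on the FEFS file system of the LN.
$ tar xvfz QJ210121-006.tar.gz
2. Configuration
CLST, RU, RG, LANG_VER, and PJMLANGPATH in conf/loop_peta.conf specify the cluster name, resource unit, resource group, and compiler path. Change them to match the actual machine environment.
Also change the jobenv value in script/mck*.sh to match the job execution environment name of the machine.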
○ How to run
Run the following script, which is created by the extraction step above.
$ run_mck.sh
○ Other: running only the batch job cf2_peta_lpf_mck.sh
Make the following changes before running.
1. In run_mck.sh, uncomment only the mck_batch.sh line shown in this example.
Example: $SCR_DIR/mck_batch.sh 30 20
2. In script/mck_batch.sh, comment out the list after "for i in" and put the single job script in its place.
Before:
for i in `egrep -v '(^#)' ${BASEDIR}/conf/mckjob.list`
After:
for i in "/mpitest/cf2_peta_lpf_mck.sh"
#`egrep -v '(^#)' ${BASEDIR}/conf/mckjob.list`
○ Other: recompiling the programs under job
In the directory under job that contains the job script, change the compiler path in cmp_peta.sh and then run cmp_peta.sh.
Example:
$ cd job/mpitest
$ vi ./cmp_peta.sh
$ ./cmp_peta.sh
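h2. Investigation results
We have not been able to reproduce the problem, so the following is conjecture. Because the problem does not occur on Linux, it is thought to be a McKernel issue.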
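h2. Workaround for A
When calling malloc, do not request 6 GB at once. Split the request into smaller pieces, for example six allocations of 1 GB each. This prevents the OOM that occurs when no physically contiguous region can be found.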
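The splitting described in this workaround can be sketched as follows. This is a minimal Python illustration of the size arithmetic only, not the application's actual code: the helper name `split_request` is hypothetical, each returned size stands for one separate malloc call in the real program, and reading "GB" as GiB is an assumption.

```python
GIB = 1 << 30  # assumption: "GB" in the note above is read as GiB


def split_request(total_bytes, chunk_bytes):
    """Return the sizes of the smaller requests that together cover total_bytes.

    Hypothetical helper: each returned size stands for one separate
    allocation call (for example one malloc) in the real application.
    """
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_bytes > 0")
    full, rest = divmod(total_bytes, chunk_bytes)
    sizes = [chunk_bytes] * full
    if rest:
        sizes.append(rest)  # final, smaller chunk when the split is uneven
    return sizes


if __name__ == "__main__":
    plan = split_request(6 * GIB, 1 * GIB)
    print(len(plan), "requests of", plan[0], "bytes each")
```

Note that the application then has to work with several separate buffers instead of one contiguous 6 GB region.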
h2. Notes
Linux kmsg:
[ 2728.735301] rus_vm_fault: error inserting mapping for 0x0x1006f22f0000 (req: TID: 116, syscall: 66) error: -16, vm_start: 0x0, vm_end: 0x400000000000, pgsize: 65536, ind: 0
[ 2728.751649] rus_vm_fault: vm_insert_pfn returned -16
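The negative numbers in these messages are kernel error codes. As a reading aid, the sketch below maps them to symbolic names with Python's standard `errno` module. `decode_kernel_error` is a hypothetical helper, and the 512 entry is added by hand because ERESTARTSYS is a Linux kernel-internal code that the `errno` module does not list.

```python
import errno

# Kernel-internal codes that never reach user space and are therefore
# absent from the errno module. 512 is ERESTARTSYS in the Linux kernel.
KERNEL_INTERNAL = {512: "ERESTARTSYS"}


def decode_kernel_error(code):
    """Map a negative kernel return value such as -16 to its symbolic name."""
    number = abs(code)
    if number in KERNEL_INTERNAL:
        return KERNEL_INTERNAL[number]
    return errno.errorcode.get(number, "unknown")


if __name__ == "__main__":
    for code in (-16, -12, -512):
        print(code, decode_kernel_error(code))
```

Here -16 is the value that vm_insert_pfn returned in the Linux kmsg; -12 and -512 are codes that occur in the McKernel kmsg in this ticket.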
The following was output in the McKernel kmsg, but Linux did not go down.
[ 0]: boot_param_size: 65536 [ 0]: %: GICv3 [ 0]: setup_arm64 done. IHK/McKernel started. [ 0]: ns_per_tsc: 10000 [ 0]: KCommand Line: hidos dump_level=24 allow_oversubscribe time_sharing [ 0]: Physical memory: 0xb0000000 - 0xd2000000, 570425344 bytes, 8704 pages available @ NUMA: 0 [ 0]: Physical memory: 0xe0000000 - 0xfa800000, 444596224 bytes, 6784 pages available @ NUMA: 0 [ 0]: Physical memory: 0xfec00000 - 0xffc00000, 16777216 bytes, 256 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8000300000 - 0x8080000000, 2144337920 bytes, 32720 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8100000000 - 0x811fc00000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8120800000 - 0x81a0000000, 2139095040 bytes, 32640 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81a0400000 - 0x81c0000000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81c2800000 - 0x81ff800000, 1023410176 bytes, 15616 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8828400000 - 0x8829800000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8829c00000 - 0x882b000000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882b400000 - 0x882b800000, 4194304 bytes, 64 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882bc00000 - 0x886e400000, 1115684864 bytes, 17024 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8880000000 - 0x88b1000000, 822083584 bytes, 12544 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88b1400000 - 0x88d1000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d1400000 - 0x88d9000000, 130023424 bytes, 1984 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d9400000 - 0x88fe800000, 624951296 bytes, 9536 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8900000000 - 0x891fc00000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8920800000 - 0x8953400000, 851443712 bytes, 12992 pages available @ NUMA: 1 [ 0]: Physical memory: 
0x8953800000 - 0x8957400000, 62914560 bytes, 960 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8957800000 - 0x8958000000, 8388608 bytes, 128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8958800000 - 0x8973800000, 452984832 bytes, 6912 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8973c00000 - 0x8998400000, 612368384 bytes, 9344 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8998800000 - 0x89a0000000, 125829120 bytes, 1920 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89a0400000 - 0x89c0000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89c2800000 - 0x89e1400000, 515899392 bytes, 7872 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89e1800000 - 0x89ff800000, 503316480 bytes, 7680 pages available @ NUMA: 1 [ 0]: Physical memory: 0x9028000000 - 0x9029400000, 20971520 bytes, 320 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9029800000 - 0x902b400000, 29360128 bytes, 448 pages available @ NUMA: 2 [ 0]: Physical memory: 0x902c800000 - 0x911fc00000, 4081057792 bytes, 62272 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9120800000 - 0x916b800000, 1258291200 bytes, 19200 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9180000000 - 0x9198c00000, 415236096 bytes, 6336 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9199000000 - 0x91a0000000, 117440512 bytes, 1792 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91a0400000 - 0x91c0000000, 532676608 bytes, 8128 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91c2c00000 - 0x91e4000000, 557842432 bytes, 8512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91e4400000 - 0x91ef000000, 180355072 bytes, 2752 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91ef400000 - 0x91ff800000, 272629760 bytes, 4160 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9830000000 - 0x98fe400000, 3460300800 bytes, 52800 pages available @ NUMA: 3 [ 0]: Physical memory: 0x98fec00000 - 0x995fc00000, 1627389952 bytes, 24832 pages available @ NUMA: 3 [ 0]: Physical memory: 
0x9960000000 - 0x9971c00000, 297795584 bytes, 4544 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9980400000 - 0x99a0000000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99a2c00000 - 0x99de400000, 998244352 bytes, 15232 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99de800000 - 0x99fe400000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99fe800000 - 0x99ff000000, 8388608 bytes, 128 pages available @ NUMA: 3 [ 0]: NUMA: 0, Linux NUMA: 4, type: 1, available bytes: 7403995136, pages: 112976 [ 0]: NUMA: 1, Linux NUMA: 5, type: 1, available bytes: 7470055424, pages: 113984 [ 0]: NUMA: 2, Linux NUMA: 6, type: 1, available bytes: 7465861120, pages: 113920 [ 0]: NUMA: 3, Linux NUMA: 7, type: 1, available bytes: 7457472512, pages: 113792 [ 0]: NUMA 0 distances: 0 (10), 1 (20), 2 (30), 3 (30), [ 0]: NUMA 1 distances: 1 (10), 0 (20), 2 (30), 3 (30), [ 0]: NUMA 2 distances: 2 (10), 3 (20), 0 (30), 1 (30), [ 0]: NUMA 3 distances: 3 (10), 2 (20), 0 (30), 1 (30), [ 0]: Trampoline area: 0x0 [ 0]: # of cpus : 48 [ 0]: locals = ffff800030080000 [ 0]: BSP: 0 (HW ID: 12 @ NUMA 0) [ 0]: SVE: maximum available vector length 64 bytes per vector [ 0]: SVE: default vector length 64 bytes per vector [ 0]: BSP: booted 47 AP CPUs [ 0]: Master channel init acked. [ 0]: Using Linux work IRQ for IKC IPI. [ 0]: Enable Host mapping vDSO. [ 0]: tof_utofu_init_globals: linux_vmalloc_start: ffff000010000000 [ 0]: Tofu globals initialized. IHK/McKernel booted. 
[ 36]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: sys_mmap(0,0,0,0,0,0):EINVAL [ 12]: sys_mmap(0,0,0,0,0,0):EINVAL [ 0]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: hugefileobj_create: obj: 0xffff8097b190b500, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: hugefileobj_create: obj: 0xffff808fa934a3e0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 12]: hugefileobj_create: obj: 0xffff8087a877c440, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: hugefileobj_create: obj: 0xffff800032afd960, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 5]: CPU5: shutdown. [ 6]: CPU6: shutdown. [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 5]: CPU5: shutdown. [ 6]: CPU6: shutdown. [ 7]: CPU7: shutdown. [ 8]: CPU8: shutdown. [ 9]: CPU9: shutdown. [ 10]: CPU10: shutdown. [ 11]: CPU11: shutdown. [ 12]: CPU12: shutdown. [ 13]: CPU13: shutdown. [ 14]: CPU14: shutdown. [ 15]: CPU15: shutdown. [ 16]: CPU16: shutdown. [ 17]: CPU17: shutdown. [ 18]: CPU18: shutdown. [ 19]: CPU19: shutdown. [ 20]: CPU20: shutdown. [ 21]: CPU21: shutdown. [ 22]: CPU22: shutdown. [ 23]: CPU23: shutdown. [ 24]: CPU24: shutdown. [ 25]: CPU25: shutdown. [ 26]: CPU26: shutdown. [ 27]: CPU27: shutdown. [ 28]: CPU28: shutdown. [ 29]: CPU29: shutdown. [ 30]: CPU30: shutdown. [ 31]: CPU31: shutdown. [ 32]: CPU32: shutdown. [ 33]: CPU33: shutdown. [ 34]: CPU34: shutdown. [ 35]: CPU35: shutdown. [ 36]: CPU36: shutdown. [ 37]: CPU37: shutdown. [ 38]: CPU38: shutdown. [ 39]: CPU39: shutdown. [ 40]: CPU40: shutdown. [ 41]: CPU41: shutdown. [ 0]: boot_param_size: 65536 [ 0]: %: GICv3 [ 0]: setup_arm64 done. IHK/McKernel started. 
[ 0]: ns_per_tsc: 10000 [ 0]: KCommand Line: hidos dump_level=24 allow_oversubscribe time_sharing [ 0]: Physical memory: 0xb0000000 - 0xd2000000, 570425344 bytes, 8704 pages available @ NUMA: 0 [ 0]: Physical memory: 0xe0000000 - 0xfa800000, 444596224 bytes, 6784 pages available @ NUMA: 0 [ 0]: Physical memory: 0xfec00000 - 0xffc00000, 16777216 bytes, 256 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8000300000 - 0x8080000000, 2144337920 bytes, 32720 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8100000000 - 0x811fc00000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8120800000 - 0x81a0000000, 2139095040 bytes, 32640 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81a0400000 - 0x81c0000000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81c2800000 - 0x81ff800000, 1023410176 bytes, 15616 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8828400000 - 0x8829800000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8829c00000 - 0x882b000000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882b400000 - 0x882b800000, 4194304 bytes, 64 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882bc00000 - 0x886e400000, 1115684864 bytes, 17024 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8880000000 - 0x88b1000000, 822083584 bytes, 12544 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88b1400000 - 0x88d1000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d1400000 - 0x88d9000000, 130023424 bytes, 1984 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d9400000 - 0x88fe800000, 624951296 bytes, 9536 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8900000000 - 0x891fc00000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8920800000 - 0x8953400000, 851443712 bytes, 12992 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8953800000 - 0x8957400000, 62914560 bytes, 960 pages available @ NUMA: 1 [ 0]: Physical 
memory: 0x8957800000 - 0x8958000000, 8388608 bytes, 128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8958800000 - 0x8973800000, 452984832 bytes, 6912 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8973c00000 - 0x8998400000, 612368384 bytes, 9344 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8998800000 - 0x89a0000000, 125829120 bytes, 1920 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89a0400000 - 0x89c0000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89c2800000 - 0x89e1400000, 515899392 bytes, 7872 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89e1800000 - 0x89ff800000, 503316480 bytes, 7680 pages available @ NUMA: 1 [ 0]: Physical memory: 0x9028000000 - 0x9029400000, 20971520 bytes, 320 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9029800000 - 0x902b800000, 33554432 bytes, 512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x902c800000 - 0x911fc00000, 4081057792 bytes, 62272 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9120800000 - 0x916b400000, 1254096896 bytes, 19136 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9180000000 - 0x9198c00000, 415236096 bytes, 6336 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9199000000 - 0x91a0000000, 117440512 bytes, 1792 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91a0400000 - 0x91c0000000, 532676608 bytes, 8128 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91c2c00000 - 0x91e4000000, 557842432 bytes, 8512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91e4400000 - 0x91ef000000, 180355072 bytes, 2752 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91ef400000 - 0x91ff800000, 272629760 bytes, 4160 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9830000000 - 0x98fe400000, 3460300800 bytes, 52800 pages available @ NUMA: 3 [ 0]: Physical memory: 0x98fec00000 - 0x995fc00000, 1627389952 bytes, 24832 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9960000000 - 0x9971c00000, 297795584 bytes, 4544 pages available @ NUMA: 3 [ 0]: 
Physical memory: 0x9980400000 - 0x99a0000000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99a2c00000 - 0x99de400000, 998244352 bytes, 15232 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99de800000 - 0x99fe400000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99fe800000 - 0x99ff000000, 8388608 bytes, 128 pages available @ NUMA: 3 [ 0]: NUMA: 0, Linux NUMA: 4, type: 1, available bytes: 7403995136, pages: 112976 [ 0]: NUMA: 1, Linux NUMA: 5, type: 1, available bytes: 7470055424, pages: 113984 [ 0]: NUMA: 2, Linux NUMA: 6, type: 1, available bytes: 7465861120, pages: 113920 [ 0]: NUMA: 3, Linux NUMA: 7, type: 1, available bytes: 7457472512, pages: 113792 [ 0]: NUMA 0 distances: 0 (10), 1 (20), 2 (30), 3 (30), [ 0]: NUMA 1 distances: 1 (10), 0 (20), 2 (30), 3 (30), [ 0]: NUMA 2 distances: 2 (10), 3 (20), 0 (30), 1 (30), [ 0]: NUMA 3 distances: 3 (10), 2 (20), 0 (30), 1 (30), [ 0]: Trampoline area: 0x0 [ 0]: # of cpus : 48 [ 0]: locals = ffff800030080000 [ 0]: BSP: 0 (HW ID: 12 @ NUMA 0) [ 0]: SVE: maximum available vector length 64 bytes per vector [ 0]: SVE: default vector length 64 bytes per vector [ 0]: BSP: booted 47 AP CPUs [ 0]: Master channel init acked. [ 0]: Using Linux work IRQ for IKC IPI. [ 0]: Enable Host mapping vDSO. [ 0]: tof_utofu_init_globals: linux_vmalloc_start: ffff000010000000 [ 0]: Tofu globals initialized. IHK/McKernel booted. 
[ 0]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: sys_mmap(0,0,0,0,0,0):EINVAL [ 12]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: hugefileobj_create: obj: 0xffff8097b1a7a700, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: hugefileobj_create: obj: 0xffff800032cfaa20, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 12]: hugefileobj_create: obj: 0xffff8087a854c340, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: hugefileobj_create: obj: 0xffff808fa929e280, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: rusage_check_oom: memory used:29793386496 available:29797384192 [ 24]: page_fault_process_memory_range(ffff808fa81e4ca0,100a66600000-100a66800000 733004,100a66700018,40000006):cannot allocate new page. 
-12 [ 24]: page_fault_handler fault VM failed for TID: 71, addr: 0x100a66700018, reason: 6, error: -12 [ 24]: Page fault for 0x100a66700018 [ 24]: no page found for write access in user mode (reserved bit wasn't set), it wasn't an instruction fetch [ 24]: address is in range, flag: 0x733004 () [ 24]: ihk_mc_pt_print_pte: 0x100a66700018, CONFIG_ARM64_PGTABLE_LEVELS: 3, ptl4_index: 0, ptl3_index: 4, ptl2_index: 83, ptl1_index: 1648 [ 24]: l4 table: 0x90285C0000 l4idx: 0 [ 24]: l4 entry: 0x3 [ 24]: l3 table: 0x90286C0000 l3idx: 4 [ 24]: l3 entry: 0x90286C0003 [ 24]: l2 table: 0x91DA240000 l2idx: 83 [ 24]: l2 entry: 0x91DA240003 [ 24]: l1 table: 0x0 l1idx: 1648 [ 24]: 0x100A66700018 l1idx not present! [ 24]: l1 entry: 0x0 [ 24]: PC: 0x100000d1b250 (b250 in /opt/FJSVxos/mmm/lib64/libmpg.so.1) [ 24]: dump pt_regs: [ 24]: x0 : 00000000000ffff1 x1 : 0000100a66700010 x2 : 0000000000100014 x3 : 0000100000d446b8 [ 24]: x4 : 0000100000d44788 x5 : 00000000000fffd1 x6 : 0000100a66400000 x7 : 0000000000000001 [ 24]: x8 : 00000000000000de x9 : 0000000000000083 x10 : 000000000000008e x11 : 00003ffffffff840 [ 24]: x12 : 0000000000000000 x13 : 0000100000d11470 x14 : 0000000000000000 x15 : 1e042f3a1c140200 [ 24]: x16 : 0000100000d44668 x17 : 0000100000e5b5a0 x18 : 0000100000658850 x19 : 0000100000d44788 [ 24]: x20 : 0000000000100011 x21 : 00000000000ffff0 x22 : 0000100a66600010 x23 : 0000100000d44788 [ 24]: x24 : 0000000000100030 x25 : 0000100000d44788 x26 : 0000000000000000 x27 : 0000000000100010 [ 24]: x28 : 0000100000d44788 x29 : 00003ffffffff6c0 x30 : 0000100000d1b714 [ 24]: sp : 00003ffffffff6c0 [ 24]: pc : 0000100000d1b250 [ 24]: pstate : 0000000060000000(N:0 Z:1 C:1 V:0 SS:0 IL:0 D:0 A:0 I:0 F:0 M[4]:0 M:0) [ 24]: orig_x0 : 0000000000000000 [ 24]: syscallno : ffffffffffffffff [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 
available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: rusage_check_oom: memory used:29793452032 available:29797384192 [ 12]: page_fault_process_memory_range(ffff8087a8544ca0,100a67e00000-100a68000000 733004,100a67f00018,40000006):cannot allocate new page. -12 [ 12]: page_fault_handler fault VM failed for TID: 70, addr: 0x100a67f00018, reason: 6, error: -12 [ 12]: Page fault for 0x100a67f00018 [ 12]: no page found for write access in user mode (reserved bit wasn't set), it wasn't an instruction fetch [ 12]: address is in range, flag: 0x733004 ( ) [ 12]: ihk_mc_pt_print_pte: 0x100a67f00018, CONFIG_ARM64_PGTABLE_LEVELS: 3, ptl4_index: 0, ptl3_index: 4, ptl2_index: 83, ptl1_index: 2032 [ 12]: l4 table: 0x88289C0000 l4idx: 0 [ 12]: l4 entry: 0x3 [ 12]: l3 table: 0x8828AC0000 l3idx: 4 [ 12]: l3 entry: 0x8828AC0003 [ 12]: l2 table: 0x89DB030000 l2idx: 83 [ 12]: l2 entry: 0x89DB030003 [ 12]: l1 table: 0x0 l1idx: 2032 [ 12]: 0x100A67F00018 l1idx not present! 
[ 12]: l1 entry: 0x0 [ 12]: PC: 0x100000d1b250 (b250 in /opt/FJSVxos/mmm/lib64/libmpg.so.1) [ 12]: dump pt_regs: [ 12]: x0 : 00000000000ffff1 x1 : 0000100a67f00010 x2 : 0000000000100014 x3 : 0000100000d446b8 [ 12]: x4 : 0000100000d44788 x5 : 00000000000fffd1 x6 : 0000100a67c00000 x7 : 0000000000000001 [ 12]: x8 : 00000000000000de x9 : 0000000000000083 x10 : 000000000000008e x11 : 00003ffffffff840 [ 12]: x12 : 0000000000000000 x13 : 0000100000d11470 x14 : 0000000000000000 x15 : 1e042f3a1c140200 [ 12]: x16 : 0000100000d44668 x17 : 0000100000e5b5a0 x18 : 0000100000658850 x19 : 0000100000d44788 [ 12]: x20 : 0000000000100011 x21 : 00000000000ffff0 x22 : 0000100a67e00010 x23 : 0000100000d44788 [ 12]: x24 : 0000000000100030 x25 : 0000100000d44788 x26 : 0000000000000000 x27 : 0000000000100010 [ 12]: x28 : 0000100000d44788 x29 : 00003ffffffff6c0 x30 : 0000100000d1b714 [ 12]: sp : 00003ffffffff6c0 [ 12]: pc : 0000100000d1b250 [ 12]: pstate : 0000000060000000(N:0 Z:1 C:1 V:0 SS:0 IL:0 D:0 A:0 I:0 F:0 M[4]:0 M:0) [ 12]: orig_x0 : 0000000000000000 [ 12]: syscallno : ffffffffffffffff [ 24]: page_fault_handler: PF in PF?? [ 24]: PANIC [ 12]: page_fault_handler: PF in PF?? 
[ 12]: PANIC [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: coredump: ERROR: do_syscall failed (-512) [ 0]: do_signal: ERROR: coredump failed (-512) [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=71, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=70, tid=-1, sig=15)=0 [ 0]: SCD_MSG_SEND_SIGNAL: do_kill(pid=72, tid=-1, sig=15)=0 [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 5]: CPU5: shutdown. [ 6]: CPU6: shutdown. [ 7]: CPU7: shutdown. [ 8]: CPU8: shutdown. [ 9]: CPU9: shutdown. [ 10]: CPU10: shutdown. [ 11]: CPU11: shutdown. [ 12]: CPU12: shutdown. [ 13]: CPU13: shutdown. 
[ 14]: CPU14: shutdown. [ 15]: CPU15: shutdown. [ 16]: CPU16: shutdown. [ 17]: CPU17: shutdown. [ 18]: CPU18: shutdown. [ 19]: CPU19: shutdown. [ 20]: CPU20: shutdown. [ 21]: CPU21: shutdown. [ 22]: CPU22: shutdown. [ 23]: CPU23: shutdown. [ 24]: CPU24: shutdown. [ 0]: boot_param_size: 65536 [ 0]: %: GICv3 [ 0]: setup_arm64 done. IHK/McKernel started. [ 0]: ns_per_tsc: 10000 [ 0]: KCommand Line: hidos dump_level=24 allow_oversubscribe time_sharing [ 0]: Physical memory: 0xb0000000 - 0xd2000000, 570425344 bytes, 8704 pages available @ NUMA: 0 [ 0]: Physical memory: 0xe0000000 - 0xfa800000, 444596224 bytes, 6784 pages available @ NUMA: 0 [ 0]: Physical memory: 0xfec00000 - 0xffc00000, 16777216 bytes, 256 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8000300000 - 0x8080000000, 2144337920 bytes, 32720 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8100000000 - 0x811fc00000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8120800000 - 0x81a0000000, 2139095040 bytes, 32640 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81a0400000 - 0x81c0000000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81c2800000 - 0x81ff800000, 1023410176 bytes, 15616 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8828400000 - 0x8829800000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8829c00000 - 0x882b000000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882b400000 - 0x882b800000, 4194304 bytes, 64 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882bc00000 - 0x886e400000, 1115684864 bytes, 17024 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8880000000 - 0x88b1000000, 822083584 bytes, 12544 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88b1400000 - 0x88d1000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d1400000 - 0x88d9000000, 130023424 bytes, 1984 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d9400000 - 0x88fe800000, 
624951296 bytes, 9536 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8900000000 - 0x891fc00000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8920800000 - 0x8953400000, 851443712 bytes, 12992 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8953800000 - 0x8957400000, 62914560 bytes, 960 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8957800000 - 0x8958000000, 8388608 bytes, 128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8958800000 - 0x8973800000, 452984832 bytes, 6912 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8973c00000 - 0x8998400000, 612368384 bytes, 9344 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8998800000 - 0x89a0000000, 125829120 bytes, 1920 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89a0400000 - 0x89c0000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89c2800000 - 0x89e1400000, 515899392 bytes, 7872 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89e1800000 - 0x89ff800000, 503316480 bytes, 7680 pages available @ NUMA: 1 [ 0]: Physical memory: 0x9028000000 - 0x9029400000, 20971520 bytes, 320 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9029800000 - 0x902b800000, 33554432 bytes, 512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x902c800000 - 0x911fc00000, 4081057792 bytes, 62272 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9120800000 - 0x916b400000, 1254096896 bytes, 19136 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9180000000 - 0x9198c00000, 415236096 bytes, 6336 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9199000000 - 0x91a0000000, 117440512 bytes, 1792 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91a0400000 - 0x91c0000000, 532676608 bytes, 8128 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91c2c00000 - 0x91e4000000, 557842432 bytes, 8512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91e4400000 - 0x91ef000000, 180355072 bytes, 2752 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91ef400000 - 0x91ff800000, 
272629760 bytes, 4160 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9830000000 - 0x98fe400000, 3460300800 bytes, 52800 pages available @ NUMA: 3 [ 0]: Physical memory: 0x98fec00000 - 0x995fc00000, 1627389952 bytes, 24832 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9960000000 - 0x9971c00000, 297795584 bytes, 4544 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9980400000 - 0x99a0000000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99a2c00000 - 0x99de400000, 998244352 bytes, 15232 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99de800000 - 0x99fe400000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99fe800000 - 0x99ff000000, 8388608 bytes, 128 pages available @ NUMA: 3 [ 0]: NUMA: 0, Linux NUMA: 4, type: 1, available bytes: 7403995136, pages: 112976 [ 0]: NUMA: 1, Linux NUMA: 5, type: 1, available bytes: 7470055424, pages: 113984 [ 0]: NUMA: 2, Linux NUMA: 6, type: 1, available bytes: 7465861120, pages: 113920 [ 0]: NUMA: 3, Linux NUMA: 7, type: 1, available bytes: 7457472512, pages: 113792 [ 0]: NUMA 0 distances: 0 (10), 1 (20), 2 (30), 3 (30), [ 0]: NUMA 1 distances: 1 (10), 0 (20), 2 (30), 3 (30), [ 0]: NUMA 2 distances: 2 (10), 3 (20), 0 (30), 1 (30), [ 0]: NUMA 3 distances: 3 (10), 2 (20), 0 (30), 1 (30), [ 0]: Trampoline area: 0x0 [ 0]: # of cpus : 48 [ 0]: locals = ffff800030080000 [ 0]: BSP: 0 (HW ID: 12 @ NUMA 0) [ 0]: SVE: maximum available vector length 64 bytes per vector [ 0]: SVE: default vector length 64 bytes per vector [ 0]: BSP: booted 47 AP CPUs [ 0]: Master channel init acked. [ 0]: Using Linux work IRQ for IKC IPI. [ 0]: Enable Host mapping vDSO. [ 0]: tof_utofu_init_globals: linux_vmalloc_start: ffff000010000000 [ 0]: Tofu globals initialized. IHK/McKernel booted. 
[ 12]: sys_mmap(0,0,0,0,0,0):EINVAL [ 0]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: hugefileobj_create: obj: 0xffff8097b0f6dfa0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: hugefileobj_create: obj: 0xffff80003220e140, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: hugefileobj_create: obj: 0xffff808fa8fbc020, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 12]: hugefileobj_create: obj: 0xffff8087aa43a8e0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: boot_param_size: 65536 [ 0]: %: GICv3 [ 0]: setup_arm64 done. IHK/McKernel started. [ 0]: ns_per_tsc: 10000 [ 0]: KCommand Line: hidos dump_level=24 allow_oversubscribe time_sharing [ 0]: Physical memory: 0xb0000000 - 0xd2000000, 570425344 bytes, 8704 pages available @ NUMA: 0 [ 0]: Physical memory: 0xe0000000 - 0xfa800000, 444596224 bytes, 6784 pages available @ NUMA: 0 [ 0]: Physical memory: 0xfec00000 - 0xffc00000, 16777216 bytes, 256 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8000300000 - 0x8080000000, 2144337920 bytes, 32720 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8100000000 - 0x811fc00000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8120800000 - 0x81a0000000, 2139095040 bytes, 32640 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81a0400000 - 0x81c0000000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81c2800000 - 0x81ff800000, 1023410176 bytes, 15616 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8828400000 - 0x8829800000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8829c00000 - 0x882b000000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882b400000 - 0x882b800000, 4194304 bytes, 64 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882bc00000 - 0x886e400000, 1115684864 bytes, 17024 pages available @ NUMA: 1 [ 0]: Physical 
memory: 0x8880000000 - 0x88b1000000, 822083584 bytes, 12544 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88b1400000 - 0x88d1000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d1400000 - 0x88d9000000, 130023424 bytes, 1984 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d9400000 - 0x88fe800000, 624951296 bytes, 9536 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8900000000 - 0x891fc00000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8920800000 - 0x8953400000, 851443712 bytes, 12992 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8953800000 - 0x8957400000, 62914560 bytes, 960 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8957800000 - 0x8958000000, 8388608 bytes, 128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8958800000 - 0x8973800000, 452984832 bytes, 6912 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8973c00000 - 0x8998400000, 612368384 bytes, 9344 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8998800000 - 0x89a0000000, 125829120 bytes, 1920 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89a0400000 - 0x89c0000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89c2800000 - 0x89e1400000, 515899392 bytes, 7872 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89e1800000 - 0x89ff800000, 503316480 bytes, 7680 pages available @ NUMA: 1 [ 0]: Physical memory: 0x9028000000 - 0x9029400000, 20971520 bytes, 320 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9029800000 - 0x902b800000, 33554432 bytes, 512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x902c800000 - 0x911fc00000, 4081057792 bytes, 62272 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9120800000 - 0x916b400000, 1254096896 bytes, 19136 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9180000000 - 0x9198c00000, 415236096 bytes, 6336 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9199000000 - 0x91a0000000, 117440512 bytes, 1792 pages available @ NUMA: 2 [ 0]: Physical 
memory: 0x91a0400000 - 0x91c0000000, 532676608 bytes, 8128 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91c2c00000 - 0x91e4000000, 557842432 bytes, 8512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91e4400000 - 0x91ef000000, 180355072 bytes, 2752 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91ef400000 - 0x91ff800000, 272629760 bytes, 4160 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9830000000 - 0x98fe400000, 3460300800 bytes, 52800 pages available @ NUMA: 3 [ 0]: Physical memory: 0x98fec00000 - 0x995fc00000, 1627389952 bytes, 24832 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9960000000 - 0x9971c00000, 297795584 bytes, 4544 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9980400000 - 0x99a0000000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99a2c00000 - 0x99de400000, 998244352 bytes, 15232 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99de800000 - 0x99fe400000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99fe800000 - 0x99ff000000, 8388608 bytes, 128 pages available @ NUMA: 3 [ 0]: NUMA: 0, Linux NUMA: 4, type: 1, available bytes: 7403995136, pages: 112976 [ 0]: NUMA: 1, Linux NUMA: 5, type: 1, available bytes: 7470055424, pages: 113984 [ 0]: NUMA: 2, Linux NUMA: 6, type: 1, available bytes: 7465861120, pages: 113920 [ 0]: NUMA: 3, Linux NUMA: 7, type: 1, available bytes: 7457472512, pages: 113792 [ 0]: NUMA 0 distances: 0 (10), 1 (20), 2 (30), 3 (30), [ 0]: NUMA 1 distances: 1 (10), 0 (20), 2 (30), 3 (30), [ 0]: NUMA 2 distances: 2 (10), 3 (20), 0 (30), 1 (30), [ 0]: NUMA 3 distances: 3 (10), 2 (20), 0 (30), 1 (30), [ 0]: Trampoline area: 0x0 [ 0]: # of cpus : 48 [ 0]: locals = ffff800030080000 [ 0]: BSP: 0 (HW ID: 12 @ NUMA 0) [ 0]: SVE: maximum available vector length 64 bytes per vector [ 0]: SVE: default vector length 64 bytes per vector [ 0]: BSP: booted 47 AP CPUs [ 0]: Master channel init acked. [ 0]: Using Linux work IRQ for IKC IPI. 
[ 0]: Enable Host mapping vDSO. [ 0]: tof_utofu_init_globals: linux_vmalloc_start: ffff000010000000 [ 0]: Tofu globals initialized. IHK/McKernel booted. [ 12]: sys_mmap(0,0,0,0,0,0):EINVAL [ 0]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: sys_mmap(0,0,0,0,0,0):EINVAL [ 36]: hugefileobj_create: obj: 0xffff8097b0f6dfa0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: hugefileobj_create: obj: 0xffff80003220e140, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: hugefileobj_create: obj: 0xffff808fa8fbc020, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 12]: hugefileobj_create: obj: 0xffff8087aa43a8e0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 24]: page_fault_handler: PF in PF?? [ 24]: PANIC [ 0]: page_fault_handler: PF in PF?? [ 0]: PANIC [ 36]: page_fault_handler: PF in PF?? [ 36]: PANIC [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 5]: CPU5: shutdown. [ 6]: CPU6: shutdown. [ 7]: CPU7: shutdown. [ 8]: CPU8: shutdown. [ 9]: CPU9: shutdown. [ 10]: CPU10: shutdown. [ 11]: CPU11: shutdown. [ 12]: CPU12: shutdown. [ 13]: CPU13: shutdown. [ 14]: CPU14: shutdown. [ 15]: CPU15: shutdown. [ 16]: CPU16: shutdown. [ 0]: boot_param_size: 65536 [ 0]: %: GICv3 [ 0]: setup_arm64 done. IHK/McKernel started. 
[ 0]: ns_per_tsc: 10000 [ 0]: KCommand Line: hidos dump_level=24 allow_oversubscribe time_sharing [ 0]: Physical memory: 0xb0000000 - 0xd2000000, 570425344 bytes, 8704 pages available @ NUMA: 0 [ 0]: Physical memory: 0xe0000000 - 0xfa800000, 444596224 bytes, 6784 pages available @ NUMA: 0 [ 0]: Physical memory: 0xfec00000 - 0xffc00000, 16777216 bytes, 256 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8000300000 - 0x8080000000, 2144337920 bytes, 32720 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8100000000 - 0x811fc00000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8120800000 - 0x81a0000000, 2139095040 bytes, 32640 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81a0400000 - 0x81c0000000, 532676608 bytes, 8128 pages available @ NUMA: 0 [ 0]: Physical memory: 0x81c2800000 - 0x81ff800000, 1023410176 bytes, 15616 pages available @ NUMA: 0 [ 0]: Physical memory: 0x8828400000 - 0x8829800000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8829c00000 - 0x882b000000, 20971520 bytes, 320 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882b400000 - 0x882b800000, 4194304 bytes, 64 pages available @ NUMA: 1 [ 0]: Physical memory: 0x882bc00000 - 0x886e400000, 1115684864 bytes, 17024 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8880000000 - 0x88b1000000, 822083584 bytes, 12544 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88b1400000 - 0x88d1000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d1400000 - 0x88d9000000, 130023424 bytes, 1984 pages available @ NUMA: 1 [ 0]: Physical memory: 0x88d9400000 - 0x88fe800000, 624951296 bytes, 9536 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8900000000 - 0x891fc00000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8920800000 - 0x8953400000, 851443712 bytes, 12992 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8953800000 - 0x8957400000, 62914560 bytes, 960 pages available @ NUMA: 1 [ 0]: Physical 
memory: 0x8957800000 - 0x8958000000, 8388608 bytes, 128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8958800000 - 0x8973800000, 452984832 bytes, 6912 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8973c00000 - 0x8998400000, 612368384 bytes, 9344 pages available @ NUMA: 1 [ 0]: Physical memory: 0x8998800000 - 0x89a0000000, 125829120 bytes, 1920 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89a0400000 - 0x89c0000000, 532676608 bytes, 8128 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89c2800000 - 0x89e1400000, 515899392 bytes, 7872 pages available @ NUMA: 1 [ 0]: Physical memory: 0x89e1800000 - 0x89ff800000, 503316480 bytes, 7680 pages available @ NUMA: 1 [ 0]: Physical memory: 0x9028000000 - 0x9029400000, 20971520 bytes, 320 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9029800000 - 0x902b800000, 33554432 bytes, 512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x902c800000 - 0x911fc00000, 4081057792 bytes, 62272 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9120800000 - 0x916b400000, 1254096896 bytes, 19136 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9180000000 - 0x9198c00000, 415236096 bytes, 6336 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9199000000 - 0x91a0000000, 117440512 bytes, 1792 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91a0400000 - 0x91c0000000, 532676608 bytes, 8128 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91c2c00000 - 0x91e4000000, 557842432 bytes, 8512 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91e4400000 - 0x91ef000000, 180355072 bytes, 2752 pages available @ NUMA: 2 [ 0]: Physical memory: 0x91ef400000 - 0x91ff800000, 272629760 bytes, 4160 pages available @ NUMA: 2 [ 0]: Physical memory: 0x9830000000 - 0x98fe400000, 3460300800 bytes, 52800 pages available @ NUMA: 3 [ 0]: Physical memory: 0x98fec00000 - 0x995fc00000, 1627389952 bytes, 24832 pages available @ NUMA: 3 [ 0]: Physical memory: 0x9960000000 - 0x9971c00000, 297795584 bytes, 4544 pages available @ NUMA: 3 [ 0]: 
Physical memory: 0x9980400000 - 0x99a0000000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99a2c00000 - 0x99de400000, 998244352 bytes, 15232 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99de800000 - 0x99fe400000, 532676608 bytes, 8128 pages available @ NUMA: 3 [ 0]: Physical memory: 0x99fe800000 - 0x99ff000000, 8388608 bytes, 128 pages available @ NUMA: 3 [ 0]: NUMA: 0, Linux NUMA: 4, type: 1, available bytes: 7403995136, pages: 112976 [ 0]: NUMA: 1, Linux NUMA: 5, type: 1, available bytes: 7470055424, pages: 113984 [ 0]: NUMA: 2, Linux NUMA: 6, type: 1, available bytes: 7465861120, pages: 113920 [ 0]: NUMA: 3, Linux NUMA: 7, type: 1, available bytes: 7457472512, pages: 113792 [ 0]: NUMA 0 distances: 0 (10), 1 (20), 2 (30), 3 (30), [ 0]: NUMA 1 distances: 1 (10), 0 (20), 2 (30), 3 (30), [ 0]: NUMA 2 distances: 2 (10), 3 (20), 0 (30), 1 (30), [ 0]: NUMA 3 distances: 3 (10), 2 (20), 0 (30), 1 (30), [ 0]: Trampoline area: 0x0 [ 0]: # of cpus : 48 [ 0]: locals = ffff800030080000 [ 0]: BSP: 0 (HW ID: 12 @ NUMA 0) [ 0]: SVE: maximum available vector length 64 bytes per vector [ 0]: SVE: default vector length 64 bytes per vector [ 0]: BSP: booted 47 AP CPUs [ 0]: Master channel init acked. [ 0]: Using Linux work IRQ for IKC IPI. [ 0]: Enable Host mapping vDSO. [ 0]: tof_utofu_init_globals: linux_vmalloc_start: ffff000010000000 [ 0]: Tofu globals initialized. IHK/McKernel booted. 
[ 36]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: sys_mmap(0,0,0,0,0,0):EINVAL [ 12]: sys_mmap(0,0,0,0,0,0):EINVAL [ 0]: sys_mmap(0,0,0,0,0,0):EINVAL [ 24]: hugefileobj_create: obj: 0xffff808fa8fbb540, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 36]: hugefileobj_create: obj: 0xffff8097b104b6c0, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: hugefileobj_create: obj: 0xffff800031e5b200, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 12]: hugefileobj_create: obj: 0xffff8087aa40f840, VA: 0x0, page array allocated for 14209 pages, pagesize: 2097152 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 12]: page_fault_handler: PF in PF?? 
[ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 12]: PANIC [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: page_fault_process_memory_range(ffff800030dbbd20,100a4fc00000-100a4fe00000 733004,100a4fc00000,40000006):cannot allocate new page. -12 [ 0]: page_fault_handler fault VM failed for TID: 69, addr: 0x100a4fc00000, reason: 6, error: -12 [ 0]: Page fault for 0x100a4fc00000 [ 0]: no page found for write access in user mode (reserved bit wasn't set), it wasn't an instruction fetch [ 0]: address is in range, flag: 0x733004 ( ) [ 0]: ihk_mc_pt_print_pte: 0x100a4fc00000, CONFIG_ARM64_PGTABLE_LEVELS: 3, ptl4_index: 0, ptl3_index: 4, ptl2_index: 82, ptl1_index: 4032 [ 0]: l4 table: 0xB1460000 l4idx: 0 [ 0]: l4 entry: 0x3 [ 0]: l3 table: 0xB1560000 l3idx: 4 [ 0]: l3 entry: 0xB1560003 [ 0]: l2 table: 0x80003E0000 l2idx: 82 [ 0]: l2 entry: 0x80003E0003 [ 0]: l1 table: 0x0 l1idx: 4032 [ 0]: 0x100A4FC00000 l1idx not present! 
[ 0]: l1 entry: 0x0 [ 0]: PC: 0x100000d25018 (15018 in /opt/FJSVxos/mmm/lib64/libmpg.so.1) [ 0]: dump pt_regs: [ 0]: x0 : 0000000000200000 x1 : 0000100a4fc00000 x2 : 0000000000000003 x3 : 000000000004c022 [ 0]: x4 : ffffffffffffffff x5 : 0000000000000000 x6 : 000010000104d2f0 x7 : 0000000000000001 [ 0]: x8 : 00000000000000de x9 : 0000000000000083 x10 : 000000000000008e x11 : 00003ffffffff840 [ 0]: x12 : 0000000000000000 x13 : 0000100000d11470 x14 : 0000000000000000 x15 : 1e042f3a1c140200 [ 0]: x16 : 0000100000d44048 x17 : 0000100000d24970 x18 : 0000100000658850 x19 : 000000000004c022 [ 0]: x20 : 0000100a4fc00000 x21 : 00000000ffffffff x22 : 0000000000000003 x23 : 0000000000200000 [ 0]: x24 : 0000100000d45000 x25 : 0000100a4fe00000 x26 : 0000000000000000 x27 : 0000000000000000 [ 0]: x28 : 0000100a4f900010 x29 : 00003ffffffff650 x30 : 0000100000d2500c [ 0]: sp : 00003ffffffff650 [ 0]: pc : 0000100000d25018 [ 0]: pstate : 0000000080000000(N:1 Z:0 C:0 V:0 SS:0 IL:0 D:0 A:0 I:0 F:0 M[4]:0 M:0) [ 0]: orig_x0 : 0000000000000000 [ 0]: syscallno : ffffffffffffffff [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: rusage_check_oom: memory used:29795287040 available:29797384192 [ 0]: page_fault_handler: PF in PF?? [ 0]: PANIC [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 0]: CPU0: shutdown. [ 1]: CPU1: shutdown. [ 2]: CPU2: shutdown. [ 3]: CPU3: shutdown. [ 4]: CPU4: shutdown. [ 5]: CPU5: shutdown. [ 6]: CPU6: shutdown. [ 7]: CPU7: shutdown. [ 8]: CPU8: shutdown. [ 9]: CPU9: shutdown. [ 10]: CPU10: shutdown. [ 11]: CPU11: shutdown. [ 12]: CPU12: shutdown. [ 13]: CPU13: shutdown. [ 14]: CPU14: shutdown. [ 15]: CPU15: shutdown. [ 16]: CPU16: shutdown. [ 17]: CPU17: shutdown. [ 18]: CPU18: shutdown. [ 19]: CPU19: shutdown. [ 20]: CPU20: shutdown. [ 21]: CPU21: shutdown. [ 22]: CPU22: shutdown. [ 23]: CPU23: shutdown. [ 24]: CPU24: shutdown. [ 25]: CPU25: shutdown. [ 26]: CPU26: shutdown. 
[ 27]: CPU27: shutdown. [ 28]: CPU28: shutdown. [ 29]: CPU29: shutdown. [ 30]: CPU30: shutdown. [ 31]: CPU31: shutdown. [ 32]: CPU32: shutdown. [ 33]: CPU33: shutdown. [ 34]: CPU34: shutdown. [ 35]: CPU35: shutdown. [ 36]: CPU36: shutdown. [ 37]: CPU37: shutdown. [ 38]: CPU38: shutdown.
I have updated the workaround in the note above; please review it.
Priority: high
h2. Symptom 1
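While a McKernel job was running, the node was taken down by the node liveness monitoring.
・Time of occurrence
  ・13:26:11 - 13:31:19
・Job
  ・/fefs1/images/mckernel.img specified as the UDI
  ・Job script: cf2_peta_lpf_mck.sh
  ・Job execution environment (jobenv): mck6
  ・8-node MPI job (4 processes per node)
  ・Each rank only does malloc/memset/free. The complete job set will be provided later.
・Log
  ・/var/log/message is attached.
・Fujitsu's view
  ihkmond reported an Oops immediately beforehand.
  Jan 21 13:23:51 system8-cn0202 ihkmond[3297]: [ 38]: Unable to handle kernel paging request at virtual address 00100469
  Jan 21 13:23:51 system8-cn0202 ihkmond[3297]: [ 38]: OOps.

We suspect that some kind of hang occurred while McKernel was already booted (Running). We also suspect that McKernel panicked (?) and, as a consequence, dragged the host Linux into the hang as well.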
h2. Symptom 2
The dump could not be collected.
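h2. Cause 2
There are two possibilities.
A. The same issue as #1601.
B. The kernel is not in a state where it can accept a panic interrupt from outside (e.g. an assistant core hangs with interrupts disabled).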
h2. Solution/Workaround 2
A. Pass @-DENABLE_PANIC_NOTIFIER_WORKAROUND=ON@ to cmake so that the structure in question is not used. In this case, McKernel's memory is excluded from the Linux dump.
The fix is included in 1.7.9 and later. The related commits are as follows.
B. Add a timeout to the part of smp_ihk_os_panic_notifier that waits for the McKernel-side processing to complete.
The fix is included in 1.7.9 and later. The related commits are as follows.