ProjectMitosisOS / mitosis-core

An OS kernel module for fast **remote** fork using advanced datacenter networking (RDMA).
MIT License

./exp/connector hangs and never returns when run on the child machine #5

Open LiuMicheal opened 2 months ago

LiuMicheal commented 2 months ago

I loaded fork.ko successfully on both the parent and child machines:

[15379.285688] src/lib.rs@29: [INFO ] - Remote fork kernel module assigned ID=0
[15379.285692] /root/mitosis-core/mitosis/src/startup.rs@49: [INFO ] - Try to start MITOSIS instance, init global services
[15379.285694] /root/mitosis-core/mitosis/src/startup.rs@15: [INFO ] - [check]: use on-demand resume mode.
[15379.285695] /root/mitosis-core/mitosis/src/startup.rs@19: [INFO ] - [check]: Parent is using copy-on-write (COW) mode.
[15379.285696] /root/mitosis-core/mitosis/src/startup.rs@30: [INFO ] - [check]: Disable prefetching.
[15379.285697] /root/mitosis-core/mitosis/src/startup.rs@36: [INFO ] - [check]: Not cache remote page table.
[15379.285698] /root/mitosis-core/mitosis/src/startup.rs@40: [INFO ] - [check]: Use RDMA's reliable connection for communications.
[15379.285699] /root/mitosis-core/mitosis/src/startup.rs@45: [INFO ] - All configuration check passes !
[15379.286407] rust-kernel-rdma-base: enabling unsafe global rkey
[15379.293999] /root/mitosis-core/mitosis/src/startup.rs@61: [INFO ] - Initialize RDMA context done
[15379.294029] /root/mitosis-core/mitosis/src/dc_pool.rs@96: [INFO ] - Start initializing client-side DCQP pool.
[15379.448694] /root/mitosis-core/mitosis/src/dc_pool.rs@144: [INFO ] - Start initializing server-side DCTarget pool.
[15379.503761] /root/mitosis-core/mitosis/src/startup.rs@187: [INFO ] - RPC service initializes done
[15379.503803] /root/mitosis-core/mitosis/src/rpc_service.rs@225: [INFO ] - MITOSIS RPC thread 0 started, listing on gid: fe80:0:0:0:506b:4b03:d4:3a74
[15379.503867] /root/mitosis-core/mitosis/src/rpc_service.rs@225: [INFO ] - MITOSIS RPC thread 1 started, listing on gid: fe80:0:0:0:506b:4b03:d4:3a74
[15379.667418] /root/mitosis-core/mitosis/src/startup.rs@198: [INFO ] - Start waiting for the RPC servers to start...
[15379.667422] /root/mitosis-core/mitosis/src/startup.rs@200: [INFO ] - All RPC thread handlers initialized!
[15379.667426] /root/mitosis-core/mitosis/src/startup.rs@211: [INFO ] - All initialization done, takes 381 ms

However, when I run ./connector on the child machine, it never returns:

./connector -gid="fe80:0000:0000:0000:506b:4b03:00d4:3a74" -mac_id=0 -nic_id=0
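The module log prints the GID with leading zeros dropped (fe80:0:0:0:506b:4b03:d4:3a74), while the connector is given the zero-padded form. A minimal sketch for confirming that both spellings denote the same 128-bit value, so the textual difference alone is not a mismatch. The helper name `same_gid` is hypothetical, and treating a GID as an IPv6-style address is only a convenience for comparison:

```python
import ipaddress


def same_gid(a: str, b: str) -> bool:
    """Return True if two textual GIDs denote the same 128-bit value."""
    return ipaddress.IPv6Address(a) == ipaddress.IPv6Address(b)


# Spelling printed by the MITOSIS RPC thread log.
logged_gid = "fe80:0:0:0:506b:4b03:d4:3a74"
# Spelling passed to ./connector via -gid.
connector_gid = "fe80:0000:0000:0000:506b:4b03:00d4:3a74"

print(same_gid(logged_gid, connector_gid))  # True
```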

When I use dmesg, I get: image

After that, the following error was reported repeatedly:

[42472.244211] watchdog: BUG: soft lockup - CPU#60 stuck for 22s! [connector:6945]
[42472.244213] Modules linked in: fork(OE) xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c xt_addrtype br_netfilter ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter aufs overlay rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) esp6_offload esp6 esp4_offload esp4 xfrm_algo mlx5_ib(OE) ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) mlx_compat(OE) devlink bridge stp llc nls_iso8859_1 binfmt_misc joydev kvm_intel kvm irqbypass input_leds ast crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper ttm pcbc drm fb_sys_fops syscopyarea aesni_intel aes_x86_64 ipmi_ssif sysfillrect
[42472.244236] sysimgblt crypto_simd ipmi_si glue_helper ipmi_devintf cryptd ipmi_msghandler sch_fq_codel knem(OE) parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid mpt3sas igb raid_class i2c_algo_bit scsi_transport_sas dca ptp ahci pps_core libahci [last unloaded: mlx_compat]
[42472.244247] CPU: 60 PID: 6945 Comm: connector Tainted: G OEL 4.15.0-46-generic #49-Ubuntu
[42472.244247] Hardware name: Sugon CB80-G30/80P48B, BIOS 0CBSA014 05/21/2018
[42472.244312] RIP: 0010:ZN92$LT$os_network..datagram..ud_receiver..UDReceiver$u20$as$u20$os_network..future..Future$GT$4poll17h71b7c3ebe7d5a944E+0x5e/0x1a0 [fork]
[42472.244313] RSP: 0018:ffffb5270de8f070 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff11
[42472.244315] RAX: 0000000000000000 RBX: ffff8fb85b8be5a0 RCX: 000000000000000f
[42472.244315] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
[42472.244316] RBP: ffffb5270de8f0e0 R08: 0000000000000fff R09: 0000000000000000
[42472.244316] R10: fefefefefefefe00 R11: 0000000000000001 R12: 0000000000000000
[42472.244317] R13: ffffb5270de8f6a8 R14: ffffb5270de8f0f0 R15: ffff8fb8568c69c0
[42472.244318] FS: 00007f0c7b2e8740(0000) GS:ffff8fb860e80000(0000) knlGS:0000000000000000
[42472.244319] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42472.244319] CR2: 00007fd4b66881d8 CR3: 000000176fcae001 CR4: 00000000007606e0
[42472.244320] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42472.244321] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42472.244321] PKRU: 55555554
[42472.244321] Call Trace:
[42472.244374] ZN84$LT$os_network..rpc..Caller$LT$R$C$SS$GT$$u20$as$u20$os_network..future..Future$GT$4poll17hbc37754fdbb6d676E+0x1b/0x300 [fork]
[42472.244425] ? ZN109$LT$linux_kernel_module..mutex..LinuxMutex$LT$T$GT$$u20$as$u20$linux_kernel_module..sync..Mutex$LT$T$GT$$GT$6lock_f17h34+0x36c/0x570 [fork]
[42472.244476] ZN109$LT$linux_kernel_module..mutex..LinuxMutex$LT$T$GT$$u20$as$u20$linux_kernel_module..sync..Mutex$LT$T$GT$$GT$6lock_f17h34+0x34b/0x570 [fork]
[42472.244478] ? switch_to_asm+0x40/0x70
[42472.244480] ? switch_to_asm+0x34/0x70
[42472.244482] ? switch_to+0x13a/0x500
[42472.244483] ? switch_to_asm+0x40/0x70
[42472.244484] ? switch_to_asm+0x34/0x70
[42472.244486] ? lock_timer_base+0x6b/0x90
[42472.244489] ? backport_xas_init_marks+0x29/0x50 [mlx_compat]
[42472.244491] ? backport_xas_store+0x454/0x540 [mlx_compat]
[42472.244492] ? switch_to_asm+0x40/0x70
[42472.244493] ? switch_to_asm+0x34/0x70
[42472.244495] ? switch_to_asm+0x40/0x70
[42472.244496] ? switch_to_asm+0x34/0x70
[42472.244497] ? switch_to_asm+0x40/0x70
[42472.244499] ? switch_to_asm+0x34/0x70
[42472.244500] ? switch_to_asm+0x40/0x70
[42472.244501] ? switch_to_asm+0x34/0x70
[42472.244502] ? switch_to_asm+0x40/0x70
[42472.244504] ? switch_to_asm+0x34/0x70
[42472.244505] ? switch_to_asm+0x40/0x70
[42472.244506] ? switch_to_asm+0x34/0x70
[42472.244508] ? switch_to_asm+0x40/0x70
[42472.244509] ? switch_to_asm+0x34/0x70
[42472.244510] ? switch_to_asm+0x40/0x70
[42472.244511] ? switch_to_asm+0x34/0x70
[42472.244513] ? switch_to_asm+0x40/0x70
[42472.244514] ? switch_to_asm+0x34/0x70
[42472.244515] ? switch_to_asm+0x40/0x70
[42472.244517] ? switch_to_asm+0x34/0x70
[42472.244518] ? switch_to_asm+0x40/0x70
[42472.244519] ? switch_to_asm+0x34/0x70
[42472.244521] ? switch_to_asm+0x40/0x70
[42472.244522] ? switch_to_asm+0x34/0x70
[42472.244523] ? switch_to+0x13a/0x500
[42472.244524] ? switch_to_asm+0x40/0x70
[42472.244526] ? switch_to_asm+0x34/0x70
[42472.244527] ? switch_to_asm+0x40/0x70
[42472.244529] ? lock_timer_base+0x6b/0x90
[42472.244530] ? try_to_del_timer_sync+0x53/0x80
[42472.244532] ? del_timer_sync+0x39/0x40
[42472.244533] ? schedule_timeout+0x165/0x350
[42472.244535] ? __next_timer_interrupt+0xe0/0xe0
[42472.244546] ? mlx5_free_cmd_msg+0x4e/0x60 [mlx5_core]
[42472.244553] ? free_msg+0x54/0x60 [mlx5_core]
[42472.244560] ? cmd_exec+0x638/0xba0 [mlx5_core]
[42472.244567] ? mlx5_cmd_exec+0x37/0x50 [mlx5_core]
[42472.244572] ? mlx5_ib_add_gid+0x160/0x160 [mlx5_ib]
[42472.244631] ? _ZN8KRdmaKit7context7Context9query_gid17h1c02f1deda791332E+0x45/0xa0 [fork]
[42472.244680] _ZN7mitosis15rpc_caller_pool10CallerPool18connect_session_at17h319b3d463ca7ecf1E+0x307/0x440 [fork]
[42472.244733] _ZN7mitosis7startup20probe_remote_rpc_end17h68e889abb6fb1670E+0x105/0x1e0 [fork]
[42472.244779] _ZN19linux_kernel_module15file_operations15ioctrl_callback17h39a258ccaee7971bE+0x85d/0xc80 [fork]
[42472.244781] ? handle_mm_fault+0x478/0x5c0
[42472.244783] ? _cond_resched+0x19/0x40
[42472.244784] ? get_page_from_freelist+0xf16/0x1400
[42472.244786] ? __follow_mount_rcu.isra.26+0x6e/0xf0
[42472.244787] ? lookup_fast+0xcc/0x320
[42472.244789] ? __handle_mm_fault+0x478/0x5c0
[42472.244791] ? _cond_resched+0x19/0x40
[42472.244792] ? get_page_from_freelist+0xf16/0x1400
[42472.244793] ? lookup_fast+0xcc/0x320
[42472.244795] ? chrdev_open+0xc4/0x1b0
[42472.244796] ? dput.part.23+0xba/0x1e0
[42472.244797] ? mntput+0x24/0x40
[42472.244798] ? terminate_walk+0x8e/0xf0
[42472.244799] ? path_openat+0x600/0x1770
[42472.244801] ? filemap_map_pages+0x36c/0x390
[42472.244803] ? do_filp_open+0xaf/0x110
[42472.244804] do_vfs_ioctl+0xa8/0x630
[42472.244806] ? putname+0x4c/0x60
[42472.244807] ? do_sys_open+0x13e/0x2c0
[42472.244809] SyS_ioctl+0x79/0x90
[42472.244810] do_syscall_64+0x73/0x130
[42472.244812] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[42472.244813] RIP: 0033:0x7f0c7b9a9217
[42472.244814] RSP: 002b:00007ffd7c001cb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[42472.244815] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0c7b9a9217
[42472.244816] RDX: 00007ffd7c001ce0 RSI: 0000000000000003 RDI: 0000000000000003
[42472.244816] RBP: 00007ffd7c001d00 R08: 0000000000000006 R09: 0000000000000000
[42472.244817] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[42472.244817] R13: 00007ffd7c001e20 R14: 0000000000000000 R15: 0000000000000000
[42472.244818] Code: 43 30 48 8b 78 18 48 8d 55 90 be 01 00 00 00 ff 50 20 85 c0 0f 88 9a 00 00 00 41 89 c4 83 f8 02 0f 83 0c 01 00 00 f0 48 83 2b 01 <75> 09 48 8d 7d d8 e8 07 0d 00 00 45 85 e4 0f 84 9c 00 00 00 49
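In a kernel call trace, entries prefixed with "?" are addresses the unwinder found on the stack but does not consider reliable. Filtering them out leaves a short chain that, in the trace above, appears to run from the ioctl through probe_remote_rpc_end and connect_session_at into a poll loop. The following is a rough sketch of that filtering, not part of MITOSIS: the function name `reliable_frames` is hypothetical, and the sample is a shortened excerpt of the log above. It splits on dmesg timestamps, so it also works when the log has been pasted as one long line.

```python
import re
from typing import List

# dmesg timestamps look like "[42472.244321]".
TIMESTAMP = re.compile(r"\[\s*\d+\.\d+\]\s*")


def reliable_frames(dmesg_text: str) -> List[str]:
    """Return call-trace entries that are not marked '?' (unreliable)."""
    entries = [e.strip() for e in TIMESTAMP.split(dmesg_text) if e.strip()]
    frames = []
    in_trace = False
    for entry in entries:
        if entry.startswith("Call Trace:"):
            in_trace = True
            continue
        if not in_trace:
            continue
        if entry.startswith("RIP:"):
            # The user-space register dump follows the trace; stop here.
            break
        if not entry.startswith("?"):
            frames.append(entry)
    return frames


# Shortened excerpt of the trace above, kept on one line on purpose.
sample = (
    "[42472.244321] Call Trace: "
    "[42472.244478] ? switch_to_asm+0x40/0x70 "
    "[42472.244804] do_vfs_ioctl+0xa8/0x630 "
    "[42472.244806] ? putname+0x4c/0x60 "
    "[42472.244809] SyS_ioctl+0x79/0x90 "
    "[42472.244810] do_syscall_64+0x73/0x130 "
    "[42472.244812] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 "
    "[42472.244813] RIP: 0033:0x7f0c7b9a9217"
)

for frame in reliable_frames(sample):
    print(frame)
```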

When I run the top command on the child machine, I get: image

Here is my dual-port IB CX4 card:

(base) root@liu-9:~# ibstat
CA 'mlx5_0'
    CA type: MT4115
    Number of ports: 1
    Firmware version: 12.28.2006
    Hardware version: 0
    Node GUID: 0x506b4b0300d43a74
    System image GUID: 0x506b4b0300d43a74
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 3
        LMC: 0
        SM lid: 3
        Capability mask: 0x2651e84a
        Port GUID: 0x506b4b0300d43a74
        Link layer: InfiniBand
CA 'mlx5_1'
    CA type: MT4115
    Number of ports: 1
    Firmware version: 12.28.2006
    Hardware version: 0
    Node GUID: 0x506b4b0300d43a75
    System image GUID: 0x506b4b0300d43a74
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 15
        LMC: 0
        SM lid: 11
        Capability mask: 0x2651e848
        Port GUID: 0x506b4b0300d43a75
        Link layer: InfiniBand

(base) root@liu-9:~# show_gids
DEV     PORT  INDEX  GID                                      IPv4  VER  DEV
mlx5_0  1     0      fe80:0000:0000:0000:506b:4b03:00d4:3a74        v1
mlx5_1  1     0      fe80:0000:0000:0000:506b:4b03:00d4:3a75        v1
n_gids_found=2
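On InfiniBand, the GID at index 0 is the subnet prefix followed by the port GUID, which is why the two GIDs above match the Port GUID values reported by ibstat. A small sketch of that composition, assuming the default link-local prefix fe80::/64 (the function name `default_gid` is hypothetical):

```python
def default_gid(port_guid: int, prefix: int = 0xFE80000000000000) -> str:
    """Build the zero-padded index-0 GID from a 64-bit prefix and port GUID."""
    value = (prefix << 64) | port_guid
    # Split the 128-bit value into eight 16-bit groups, most significant first.
    groups = [(value >> shift) & 0xFFFF for shift in range(112, -1, -16)]
    return ":".join(f"{g:04x}" for g in groups)


# Port GUIDs taken from the ibstat output above.
print(default_gid(0x506B4B0300D43A74))  # fe80:0000:0000:0000:506b:4b03:00d4:3a74
print(default_gid(0x506B4B0300D43A75))  # fe80:0000:0000:0000:506b:4b03:00d4:3a75
```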

The connection between the two IB CX4 cards is healthy.

My environment: Ubuntu 18.04, kernel 4.15.0-46-generic, MLNX_OFED_LINUX-4.9-3.1.5.0

Could you please suggest how I can investigate this further? I will do my best to dig deeper. Thanks.

LiuMicheal commented 1 month ago

I solved the errors I encountered above:

1) When the error occurred, my two RDMA NICs were directly connected with IB cables. I changed the setup to connect them through an IB switch. After that, the "Failed to create rc connection" error went away. It seems that the subnet manager on the IB switch resolved this error.
2) Turning off the IOMMU by modifying the grub configuration resolved the two DMAR errors.

image

I am interested in remote fork with the IOMMU enabled. I will explore this and try to complete that part of the fix : )
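To confirm what the running kernel was booted with after a grub change, one option is to look for IOMMU-related parameters on the kernel command line. This is a sketch under assumptions: `intel_iommu`, `amd_iommu`, and `iommu` are standard kernel parameters, but the thread does not say which one was changed, and the example command line below (including the root device) is hypothetical.

```python
def iommu_settings(cmdline: str) -> dict:
    """Collect IOMMU-related key=value parameters from a kernel command line."""
    keys = ("intel_iommu", "amd_iommu", "iommu")
    found = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name in keys:
            found[name] = value
    return found


# Hypothetical command line; on a real machine, read /proc/cmdline instead.
example = "BOOT_IMAGE=/boot/vmlinuz-4.15.0-46-generic root=/dev/sda1 ro intel_iommu=off"
print(iommu_settings(example))  # {'intel_iommu': 'off'}
```

An empty result only means that none of these parameters was passed explicitly; whether the IOMMU is active then depends on the kernel's build-time defaults and firmware settings.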

wxdwfc commented 1 month ago

Cool! Thanks for digging into this. We currently cannot support IOMMU because MITOSIS accesses physical memory directly from kernel space.