erpc-io / eRPC

Efficient RPCs for datacenter networks
https://erpc.io/

Running Hello_world app failed on Cloudlab #92

Closed lyuxiaosu closed 1 year ago

lyuxiaosu commented 1 year ago

Hi Anuj,

Thanks for the great work! I am trying to use eRPC on Cloudlab with DPDK, but I encountered an error when running any of the apps. The error message is:

`xiaosuGW@node0:~/eRPC/hello_world$ sudo ./server
69:692838 WARNG: eRPC Nexus: Testing enabled. Perf will be low.
70:211907 WARNG: Running as primary DPDK process. eRPC DPDK daemon is not running.
terminate called after throwing an instance of 'std::runtime_error'
what(): Port 0 is down.
Aborted`

I followed the steps in "Running eRPC over DPDK on Microsoft Azure VMs" to build everything on a Cloudlab machine. The default Cloudlab machine has two ports: a public one that accepts SSH connections, and an internal one with an IP address in 10.10.1.*. So instead of creating eth1, I modified the eRPC code to replace eth1 with the internal port name, ens1f1, and rebuilt. When I run the test apps, they fail with the error above. Did I do something wrong or miss a step?

NIC model: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

rdma_core version and DPDK version: latest rdma_core from GitHub and DPDK 19.11.5

Operating system: Ubuntu 18.04

Thanks for your help!
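A quick way to investigate a `Port 0 is down` error like the one above is to check which interfaces actually report link up, and on which PCI address each one sits. The following is a minimal sketch, assuming a Linux sysfs layout and a NIC whose kernel interface stays visible while DPDK runs (as with the Mellanox mlx5 driver). The function name `list_link_state` and the `SYSFS_NET` override are hypothetical, introduced only for illustration; mapping a PCI address to a DPDK port index is not covered here.

```shell
#!/bin/sh
# Print "<iface> <pci-address> <operstate>" for every device-backed interface.
# SYSFS_NET is a hypothetical override so the function can be pointed at a test tree.
list_link_state() {
    root="${SYSFS_NET:-/sys/class/net}"
    for dir in "$root"/*; do
        [ -e "$dir/device" ] || continue      # skip virtual interfaces such as lo
        iface=$(basename "$dir")
        pci=$(basename "$(readlink -f "$dir/device")")
        state=$(cat "$dir/operstate" 2>/dev/null || echo unknown)
        printf '%s %s %s\n' "$iface" "$pci" "$state"
    done
}
```

Calling `list_link_state` prints one line per physical interface; an interface whose state is not `up` is a likely candidate for the port that DPDK reports as down.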

lyuxiaosu commented 1 year ago

It seems that on my machine the active DPDK port index is not 0. I ran `dpdk-devbind` to check the index, passed the right index when creating `erpc::Rpc`, and that problem was solved. But both the client and the server then got stuck, and finally the client printed the following:

`xiaosuGW@node1:~/eRPC/hello_world$ sudo ./client
21:389760 WARNG: eRPC Nexus: Testing enabled. Perf will be low.
21:969580 WARNG: Running as primary DPDK process. eRPC DPDK daemon is not running.
22:336090 WARNG: DpdkTransport created for Rpc ID 0, queue 0, datapath UDP port 10000
51:068527 WARNG: Rpc 0 stuck in rte_eth_tx_burst
79:768920 WARNG: Rpc 0 stuck in rte_eth_tx_burst`

It seems the packets are not being sent out on the client side.

The server eventually exited with the following error:

`xiaosuGW@node0:~/eRPC/hello_world$ sudo ./server
83:304372 WARNG: eRPC Nexus: Testing enabled. Perf will be low.
83:804786 WARNG: Running as primary DPDK process. eRPC DPDK daemon is not running.
84:142492 WARNG: DpdkTransport created for Rpc ID 0, queue 0, datapath UDP port 10000
84:140188 ERROR: eRPC Nexus: Nexus destroyed while an Rpc object depending on this Nexus is still alive.
server: /users/xiaosuGW/eRPC/src/nexus_impl/nexus.cc:100: erpc::Nexus::~Nexus(): Assertion 'false' failed.
Aborted`

Is DPDK not working on the port on the client side? Did I miss some steps? I saw there is a cloudlab.sh under the scripts folder; should I run it to bind DPDK to the port? The NIC is Mellanox, not Intel, so it doesn't use `igb_uio`, right? When I ran the script, it gave the following error:

`xiaosuGW@node1:~/eRPC/scripts/setup_dpdk$ sudo ./cloudlab.sh
modprobe: FATAL: Module igb_uio not found in directory /lib/modules/4.15.0-169-generic
Binding interfaces ens1f1 to DPDK
Warning: no supported DPDK kernel modules are loaded
Error: Driver 'igb_uio' is not loaded.`

Do you have any clue about this? Thanks.
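On the `igb_uio` question: DPDK's Mellanox poll-mode drivers run on top of the kernel driver (`mlx5_core` for ConnectX-4 Lx) and are not rebound to `igb_uio`, so a binding script written for Intel NICs would not apply. A minimal sketch for checking which kernel driver an interface uses before attempting a bind follows. The function names and the `SYSFS_NET` override are hypothetical, and treating every non-Mellanox driver as needing a bind is a simplifying assumption.

```shell
#!/bin/sh
# Print the kernel driver bound to an interface, e.g. "mlx5_core" or "ixgbe".
# SYSFS_NET is a hypothetical override so the functions can be pointed at a test tree.
kernel_driver() {
    root="${SYSFS_NET:-/sys/class/net}"
    link="$root/$1/device/driver"
    [ -e "$link" ] || return 1
    basename "$(readlink -f "$link")"
}

# Return 0 if the interface would need rebinding to a UIO/VFIO driver for DPDK,
# 1 if it should stay on its kernel driver, 2 if the driver cannot be determined.
needs_uio_bind() {
    drv=$(kernel_driver "$1") || return 2
    case "$drv" in
        mlx4_core|mlx5_core) return 1 ;;  # Mellanox PMDs work through the kernel driver
        *) return 0 ;;                    # assumption: everything else needs a bind
    esac
}
```

For example, `needs_uio_bind ens1f1 || echo "skip igb_uio binding"` would skip the bind step on a machine like the one in this thread.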

ankalia commented 1 year ago

Hi Xiaosu,

Here are some suggestions if you're still working on this:

lyuxiaosu commented 1 year ago

Hi Ankalia, thanks for the suggestions. I tried testpmd and it didn't work either, so it seems the Mellanox driver was not taking effect. The problem was finally solved by installing OFED first and then rebuilding everything. The hello world app now works.

The current problem is that I cannot use the same NIC port for both DPDK and standard UDP (going through the kernel). Do you know whether a Mellanox NIC port can be used for both? My temporary solution is to use two ports, one for standard UDP and the other for DPDK, but I would prefer to use a single port. Thanks very much.

ankalia commented 1 year ago

Generally, we don't use the same NIC port for both DPDK and kernel traffic because DPDK takes over the NIC.

You can use the same physical NIC port if you create a separate "SR-IOV" virtual function (basically a virtual NIC) for DPDK.
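As a rough illustration of the SR-IOV route, Linux exposes the number of virtual functions through the `sriov_numvfs` file of the device in sysfs. The sketch below assumes SR-IOV is already enabled in the NIC firmware and in the platform (BIOS/IOMMU), which is a separate step not shown here. The function name `create_vfs` and the `SYSFS_NET` override are hypothetical.

```shell
#!/bin/sh
# Create <count> virtual functions on <iface> by writing to its sriov_numvfs file.
# SYSFS_NET is a hypothetical override so the function can be pointed at a test tree.
create_vfs() {
    iface="$1"; count="$2"
    root="${SYSFS_NET:-/sys/class/net}"
    dev="$root/$iface/device"
    [ -f "$dev/sriov_totalvfs" ] || { echo "$iface: SR-IOV not available" >&2; return 1; }
    total=$(cat "$dev/sriov_totalvfs")
    [ "$count" -le "$total" ] || { echo "$iface: at most $total VFs supported" >&2; return 1; }
    echo 0 > "$dev/sriov_numvfs"          # reset first; a nonzero count cannot be changed directly
    echo "$count" > "$dev/sriov_numvfs"
}
```

Run as root, `create_vfs ens1f1 1` would request one virtual function on the internal port from this thread; the new function should then show up as its own PCI device, which DPDK can use while the kernel keeps the physical port.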

lyuxiaosu commented 1 year ago

Great to know this. I will read more about SR-IOV. Thanks for your help :)