erpc-io / eRPC

Efficient RPCs for datacenter networks
https://erpc.io/

Failure of creating 64 sessions to the server side #115

Open lyuxiaosu opened 1 month ago

lyuxiaosu commented 1 month ago

Hi Anuj,

I tried to create 64 sessions from the client to the server, with num_server_threads set to 1. I tuned several eRPC parameters, but it still doesn't work. On the server side, I set the following parameters:

kNumRxRingEntries=16384
kMaxQueuesPerPort=64
kNumTxRingDesc=4096
kSessionCredits=256
kSessionReqWindow=256

On the client side, I set the following parameters:

kNumRxRingEntries=4096
kMaxQueuesPerPort=64
kNumTxRingDesc=4096
kSessionCredits=1024
kSessionReqWindow=1024
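
To make the differences between the two builds easy to see, here is a minimal sketch that diffs the two parameter lists above. The values are copied from this post; the dictionary and function names are illustrative and not part of eRPC. Whether any of these mismatches is related to the failure is not established in this thread.

```python
# Values copied from the server and client lists above.
# The names server_params / client_params / diff_params are illustrative only.
server_params = {
    "kNumRxRingEntries": 16384,
    "kMaxQueuesPerPort": 64,
    "kNumTxRingDesc": 4096,
    "kSessionCredits": 256,
    "kSessionReqWindow": 256,
}
client_params = {
    "kNumRxRingEntries": 4096,
    "kMaxQueuesPerPort": 64,
    "kNumTxRingDesc": 4096,
    "kSessionCredits": 1024,
    "kSessionReqWindow": 1024,
}


def diff_params(a, b):
    """Return {name: (a_value, b_value)} for every name whose values differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


mismatches = diff_params(server_params, client_params)
for name, (server_value, client_value) in mismatches.items():
    print(f"{name}: server={server_value} client={client_value}")
```

With the values above, the sketch reports kNumRxRingEntries, kSessionCredits, and kSessionReqWindow as differing between the server and the client.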

On both sides, I set the number of hugepages to 4096 with:

sudo bash -c "echo 4096 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages"
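
For reference, 4096 hugepages of 2048 kB each amount to 8 GiB on NUMA node 0. The kernel can reserve fewer pages than requested if it cannot find enough contiguous memory, so it may be worth reading the value back. The following is a rough sketch of that check; the sysfs path is the one from the command above, and the helper names are hypothetical.

```python
from pathlib import Path

# Same sysfs file that the command above writes to.
NR_HUGEPAGES = Path(
    "/sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages"
)
PAGE_KB = 2048  # hugepage size implied by the "hugepages-2048kB" directory


def reserved_gib(pages, page_kb=PAGE_KB):
    """Convert a hugepage count into GiB."""
    return pages * page_kb / (1024 * 1024)


def read_reserved_pages(path=NR_HUGEPAGES):
    """Return the page count the kernel reports, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


requested = 4096
print(f"requested: {requested} pages = {reserved_gib(requested):.1f} GiB")

actual = read_reserved_pages()
if actual is None:
    print("could not read nr_hugepages on this machine")
elif actual < requested:
    print(f"kernel reserved only {actual} of {requested} pages")
else:
    print(f"kernel reports {actual} pages reserved")
```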

When I started the test, the client printed out the log showing it sent out the correct packets:

Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1024, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 1, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1025, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1026, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 20, reqn 1024, pktn 0, msz 4, req type 1, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 20, reqn 1024, pktn 0, msz 4, req type 1, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 20, reqn 1025, pktn 0, msz 4, req type 1, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1027, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 20, reqn 1026, pktn 0, msz 4, req type 1, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1028, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1024, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].
Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, reqn 1025, pktn 0, msz 4, req type 18, magic 11]. Frame  = [ETH: dst 1c:34:da:41:d2:c4, src 1c:34:da:41:ce:f4, eth_type 2048], [IPv4: ihl 5, version 4, ecn 1, tot_len 46, id 0, frag_off 0, ttl 128, protocol 17, check 26877, src IP 1.2.3.4, dst IP 1.2.3.5], [UDP: src_port 10042, dst_port 10000, len 26, check 0].

On the server side, however, every received packet appears to be invalid and is dropped:

Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, req type 0, magic 0] with invalid magic. Packet headroom = [ETH: dst 0:0:0:0:0:0, src 0:0:0:0:0:0, eth_type 0], [IPv4: ihl 0, version 0, ecn 0, tot_len 0, id 0, frag_off 0, ttl 0, protocol 0, check 0, src IP 0.0.0.0, dst IP 0.0.0.0], [UDP: src_port 0, dst_port 0, len 0, check 0]. Dropping.
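
To compare the two logs mechanically, here is a small sketch that extracts the packet header fields from one client TX line and one server RX line (both shortened from the logs above). The regular expression and helper names are my own and only match the log format shown in this post. The TX header carries magic 11, while every numeric field of the RX header is zero. That would be consistent with the server reading buffers that contain no packet data, but the thread does not confirm this.

```python
import re

# Matches the bracketed header as printed in the logs above.
PKTHDR_RE = re.compile(
    r"\[type (?P<type>\w+), dsn (?P<dsn>\d+), reqn (?P<reqn>\d+), "
    r"pktn (?P<pktn>\d+), msz (?P<msz>\d+), req type (?P<req_type>\d+), "
    r"magic (?P<magic>\d+)\]"
)


def parse_pkthdr(line):
    """Return the header fields of a log line as a dict, or None if absent."""
    match = PKTHDR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # "type" stays a string (e.g. "REQ"); every other field is numeric.
    return {k: (v if k == "type" else int(v)) for k, v in fields.items()}


def is_zeroed(hdr):
    """True if every numeric header field is zero."""
    return all(v == 0 for k, v in hdr.items() if k != "type")


tx_line = ("Transport: TX (idx = 0, drop = 0). pkthdr = [type REQ, dsn 0, "
           "reqn 1024, pktn 0, msz 4, req type 18, magic 11].")
rx_line = ("Rpc 0: Received [type REQ, dsn 0, reqn 0, pktn 0, msz 0, "
           "req type 0, magic 0] with invalid magic.")

tx_hdr = parse_pkthdr(tx_line)
rx_hdr = parse_pkthdr(rx_line)
print("TX magic:", tx_hdr["magic"], "zeroed:", is_zeroed(tx_hdr))
print("RX magic:", rx_hdr["magic"], "zeroed:", is_zeroed(rx_hdr))
```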

I spent a lot of time trying to figure this out, without success. Is there a parameter that I changed incorrectly, or one that I should have changed but didn't, that could cause this issue? The NIC I am using is a Mellanox CX5. Thanks for your help.

lyuxiaosu commented 1 month ago

I forgot to mention that the DPDK version I used is 19.11.5

lyuxiaosu commented 1 month ago

I'm not sure whether this is related to the Mellanox CX5 NIC. This says that the Mellanox CX5 will drop packets when there are more than 32 RX queues, and in my case 64 RX queues are created. Everything works very well with kMaxQueuesPerPort=32.
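
The observation above can be written down as a simple pre-flight check. This is only a sketch: the limit of 32 comes from the report mentioned in this comment and is not a verified hardware limit, and the constant and function names are hypothetical.

```python
# Suspected limit taken from the report mentioned above; NOT a verified
# property of the Mellanox CX5.
SUSPECTED_MAX_RX_QUEUES = 32


def check_queue_count(max_queues_per_port, limit=SUSPECTED_MAX_RX_QUEUES):
    """Return a warning string if the queue count exceeds the suspected limit."""
    if max_queues_per_port > limit:
        return (f"kMaxQueuesPerPort={max_queues_per_port} exceeds the "
                f"suspected limit of {limit} RX queues")
    return None


for value in (32, 64):
    warning = check_queue_count(value)
    print(value, "->", warning if warning else "ok")
```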

ankalia commented 1 month ago

Thanks for bringing it up. Do eRPC's examples and benchmarks (e.g., hello_world, small_rpc_tput) work in your cluster?

lyuxiaosu commented 1 month ago

> Thanks for bringing it up. Do eRPC's examples and benchmarks (e.g., hello_world, small_rpc_tput) work in your cluster?

Yes, I tested hello_world, latency, and server_rate, and they all work very well. I didn't try small_rpc_tput.