erpc-io / eRPC

Efficient RPCs for datacenter networks
https://erpc.io/

Using Modded Driver With CX5 (Eth) #45

Closed (theojepsen closed this issue 2 years ago)

theojepsen commented 3 years ago

Hi,

I'm running the latency app, and the latency seems rather high. The eRPC paper reports ~2.3us median latency with CX5 (Eth), but I'm getting ~5.3us:

$ ./scripts/do.sh 1 0
Installing modded drivers
do.sh: Launching process 1 on NUMA node 0
39:883042 WARNG: Modded driver unavailable. Performance will be low.
Process 1: Creating session to 10.0.1.98:31850.
Process 1: Session connected. Starting work.
write_size median_us 5th_us 99th_us 999th_us
64 5.3 5.1 6.4 9.2
128 5.3 5.1 6.4 9.2
256 5.4 5.2 6.5 9.4
512 5.5 5.3 6.7 9.4
1024 7.7 7.4 9.3 12.2
64 5.2 5.1 6.2 9.2
128 5.3 5.1 6.2 9.2
256 5.3 5.2 6.2 9.1
512 5.5 5.3 6.4 9.7
1024 7.7 7.4 9.0 12.8
64 5.2 5.1 6.2 9.5
128 5.3 5.1 6.2 9.7
256 5.3 5.2 6.2 9.8
512 5.5 5.3 6.4 9.8
1024 7.6 7.4 8.8 13.0
64 5.2 5.1 6.1 9.7
128 5.3 5.1 6.2 9.9
256 5.3 5.2 6.3 9.8
512 5.5 5.3 6.5 10.0
1024 7.6 7.4 8.9 13.6

I'm running this on two servers, each with 12 Intel E5-2603 v3 @ 1.60GHz and CX5 (Eth) NICs:

- MLNX_OFED_LINUX_VERSION: 4.7-1.0.0.1
- MLNX_OFED_ARCH: x86_64
- MLNX_OFED_DISTRO: ubuntu16.04
- distro: ubuntu16.04
- arch: x86_64
- kernel: 4.15.0-65-generic

This is how I configured and built eRPC:

echo latency > scripts/autorun_app_file
cmake . -DPERF=ON -DTRANSPORT=infiniband -DROCE=on -DLOG_LEVEL=warn
make -j12

I tried to use the modded driver. I built the driver with the following steps:

cd drivers/4.4/libmlx5-41mlnx1
./autogen.sh
./configure
make
./update-driver.sh

I see that this replaced /usr/lib/libmlx5.so.1.0.0 with the modded driver that was just built in drivers/4.4/libmlx5-41mlnx1/src/.libs/libmlx5.so.1.0.0.
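One way to double-check that the replacement took effect is to compare checksums of the installed library and the freshly built one. The sketch below is only an illustration: the helper names `sha256_of` and `same_file_contents` are hypothetical, and the two paths are the ones mentioned above (the relative path assumes the script is run from the eRPC source root). Matching hashes only show that the file on disk was replaced; they do not show that eRPC loads that file or detects it as modded.

```python
import hashlib
from pathlib import Path

# Paths taken from the post above; adjust them for your system.
INSTALLED = Path("/usr/lib/libmlx5.so.1.0.0")
BUILT = Path("drivers/4.4/libmlx5-41mlnx1/src/.libs/libmlx5.so.1.0.0")


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(a, b):
    """Return True only if both files exist and have identical contents."""
    a, b = Path(a), Path(b)
    if not (a.is_file() and b.is_file()):
        return False
    return sha256_of(a) == sha256_of(b)


if __name__ == "__main__":
    if same_file_contents(INSTALLED, BUILT):
        print("Installed libmlx5 matches the modded build.")
    else:
        print("Installed libmlx5 differs from the modded build (or a file is missing).")
```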

However, when I try running eRPC again (the latency app), it warns that the modded driver is unavailable, and throws a fatal error:

$ ./scripts/do.sh 1 0
Installing modded drivers
do.sh: Launching process 0 on NUMA node 0
22:288858 WARNG: Modded driver unavailable. Performance will be low.
eRPC: Fatal error. Bad wc status 4231843.

Why doesn't eRPC detect the modded driver? Did I miss any steps? Do you have any suggestions for troubleshooting?

Thank you!

anujkaliaiitd commented 3 years ago

Could you please share what network switch you are using?

The modded driver supports only the "raw" transport (I guess I haven't documented this limitation anywhere). Could you please try with: cmake . -DPERF=ON -DTRANSPORT=raw -DLOG_LEVEL=warn.

You can also set kMaxInline = 128 in raw_transport.h to improve latency a bit.

theojepsen commented 3 years ago

Thank you for your response, @anujkaliaiitd.

I had not understood that the modded driver was only for the "raw" transport. Why does eRPC print "WARNG: Modded driver unavailable." when it's compiled with the "infiniband" (RoCE) transport, if that transport doesn't use the modded driver? Shouldn't it print this warning only when compiled with the "raw" transport?

I tried building with -DTRANSPORT=raw, set kMaxInline = 128, and ran again:

$ ./scripts/do.sh 1 0
Installing modded drivers
do.sh: Launching process 1 on NUMA node 0
88:756042 WARNG: Installing flow rule for Rpc 0. NUMA node = 0. Flow RX UDP port = 31882.
88:769578 WARNG: RawTransport created for Rpc ID 0. Device mlx5_0/enp131s0f0, port 1. IPv4 10.0.1.96, MAC 98:3:9b:67:f5:c6. Datapath UDP port 31882.
Process 1: Creating session to 10.0.1.98:31850.
Process 1: Session connected. Starting work.
write_size median_us 5th_us 99th_us 999th_us
64 4.0 3.9 4.5 7.3
128 4.5 4.3 4.9 7.8
256 4.5 4.4 5.0 7.7
512 4.7 4.5 5.1 8.7
1024 6.0 5.8 6.5 9.9
64 4.0 3.9 4.5 7.6
128 4.5 4.3 5.0 7.7
256 4.5 4.4 5.0 7.6
512 4.7 4.5 5.1 8.8
1024 6.0 5.8 6.8 9.4

Is this what you'd expect? Shouldn't the latency be lower when using infiniband ROCE, as reported in the paper?

P.S. I'm using a Tofino switch that just forwards the packet and has FEC disabled.
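To put the two runs side by side, the latency app's output can be summarized per write size. The following is a rough sketch, not part of eRPC: the function name `median_latency_by_size` is hypothetical, and the embedded rows are just the first two rounds copied from each run above. It takes the median of the reported `median_us` values for each write size and skips any line that is not a five-column data row.

```python
from collections import defaultdict
from statistics import median

# First two rounds of each run, copied from the output above.
INFINIBAND_RUN = """\
write_size median_us 5th_us 99th_us 999th_us
64 5.3 5.1 6.4 9.2
128 5.3 5.1 6.4 9.2
256 5.4 5.2 6.5 9.4
512 5.5 5.3 6.7 9.4
1024 7.7 7.4 9.3 12.2
64 5.2 5.1 6.2 9.2
128 5.3 5.1 6.2 9.2
256 5.3 5.2 6.2 9.1
512 5.5 5.3 6.4 9.7
1024 7.7 7.4 9.0 12.8
"""

RAW_RUN = """\
write_size median_us 5th_us 99th_us 999th_us
64 4.0 3.9 4.5 7.3
128 4.5 4.3 4.9 7.8
256 4.5 4.4 5.0 7.7
512 4.7 4.5 5.1 8.7
1024 6.0 5.8 6.5 9.9
64 4.0 3.9 4.5 7.6
128 4.5 4.3 5.0 7.7
256 4.5 4.4 5.0 7.6
512 4.7 4.5 5.1 8.8
1024 6.0 5.8 6.8 9.4
"""


def median_latency_by_size(output):
    """Map write_size -> median of the reported median_us values."""
    samples = defaultdict(list)
    for line in output.splitlines():
        fields = line.split()
        # Keep only data rows: five fields starting with an integer size.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        samples[int(fields[0])].append(float(fields[1]))
    return {size: median(vals) for size, vals in sorted(samples.items())}


if __name__ == "__main__":
    ib = median_latency_by_size(INFINIBAND_RUN)
    raw = median_latency_by_size(RAW_RUN)
    for size in ib:
        print(f"{size:>5} B: infiniband {ib[size]:.2f} us, raw {raw[size]:.2f} us, "
              f"delta {ib[size] - raw[size]:.2f} us")
```

On the rows included here, the 64-byte median drops from about 5.25 us with the infiniband transport to 4.0 us with the raw transport, which is still above the ~2.3 us figure cited from the paper.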