jhelovuo / RustDDS

Rust implementation of Data Distribution Service
Apache License 2.0

High latency for large messages #319

Open ariaameri opened 7 months ago

ariaameri commented 7 months ago

Hi.

First of all, I want to thank you so much for this implementation and making it open source!

I have been trying out different DDS solutions to compare them for my needs. I have been using eProsima FastDDS, bridged to Rust with the cxx crate, and got good results; however, it is cumbersome to always write the "bridge" interface between C++ and Rust. So, I decided to give RustDDS a shot.

I have noticed that the latency of UDP-based inter-process RustDDS communication on a single host is comparable to that of FastDDS when the message size is small. However, as the message size gets larger, performance drops drastically. On my machine, RustDDS gives a latency of about 1.4ms for transferring 64KB messages (eProsima FastDDS has a latency of 259us) and 36ms for transferring 1MB messages (eProsima FastDDS does that in about 750us). It fails completely for messages of size 10MB, while FastDDS has a latency of about 5ms. Such large messages can be, for example, image or video data shared among participants.

It should also be noted that, to get this performance out of FastDDS, I had to change the buffer size of the UDP socket to a much larger value (about 1MB for writes and 4MB for reads).

I looked into the RustDDS code and changed it to use the same buffer sizes via the socket2 crate, and raised the UDP packet size used for fragmentation from the hard-coded 1KB to somewhere around the maximum of 64KB. Using these tricks, I managed to get the RustDDS latency for 1MB messages down to about 8ms (from 36ms). But this is still much slower than the 775us of FastDDS.
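To put the fragment-size change in perspective, here is a rough sketch of how many fragments a message needs at a given fragment payload size. The function name and the 65,000-byte payload are assumptions for illustration only; they are not taken from the RustDDS source, and the real payload per fragment is somewhat smaller than the datagram because of RTPS and UDP/IP headers.

```rust
/// Number of fragments needed to carry `message_len` bytes when each
/// fragment holds at most `fragment_payload` bytes (ceiling division).
/// Hypothetical helper, not part of RustDDS.
fn fragment_count(message_len: usize, fragment_payload: usize) -> usize {
    assert!(fragment_payload > 0, "fragment payload must be non-zero");
    (message_len + fragment_payload - 1) / fragment_payload
}

/// Fragment counts for a message at a 1 KB payload and at an assumed
/// payload of roughly 64 KB (65,000 bytes, chosen for illustration).
fn compare_fragment_counts(message_len: usize) -> (usize, usize) {
    let small = fragment_count(message_len, 1024);
    let large = fragment_count(message_len, 65_000);
    (small, large)
}
```

Under these assumptions, `compare_fragment_counts(1024 * 1024)` gives 1024 fragments at 1 KB versus 17 at about 64 KB, so the per-fragment overhead (system calls, headers, reassembly bookkeeping) is paid far fewer times with the larger size.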

I would really prefer to use a pure Rust implementation of DDS like RustDDS, and not have to do a lot of bridging between C++ FastDDS and Rust. So, I was wondering if you know why this is happening and what the main source of this huge drop in performance is?

Thank you!

jhelovuo commented 7 months ago

Hello, and thank you for sharing your experimental results.

The most likely reason for such a difference is that within a single host, FastDDS uses shared-memory transport, which bypasses the UDP/IP network stack completely.

RustDDS implements only UDP transport, so you are comparing different mechanisms. Shared memory can transfer quite large objects without RTPS fragmentation, which makes the difference even greater.
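To get a feel for what the UDP/IP stack alone costs on a single host, independent of either DDS implementation, a raw loopback round trip can be timed with the standard library. This is only a baseline sketch: the function name is hypothetical, it measures a round trip rather than one-way latency, it does no fragmentation or reliability handling, and the payload should stay well below the datagram size limit of the platform.

```rust
use std::io;
use std::net::UdpSocket;
use std::thread;
use std::time::{Duration, Instant};

/// Average round-trip time of `rounds` UDP datagrams of `payload_len` bytes
/// over the loopback interface. Hypothetical helper for illustration.
fn measure_udp_round_trip(payload_len: usize, rounds: u32) -> io::Result<Duration> {
    assert!(rounds > 0, "at least one round is required");
    let timeout = Some(Duration::from_secs(2));

    // Echo side: bind to an OS-chosen port on loopback.
    let echo = UdpSocket::bind("127.0.0.1:0")?;
    echo.set_read_timeout(timeout)?;
    let echo_addr = echo.local_addr()?;

    // Client side: connect so that plain send/recv can be used.
    let client = UdpSocket::bind("127.0.0.1:0")?;
    client.connect(echo_addr)?;
    client.set_read_timeout(timeout)?;

    // The echo thread returns every datagram to its sender.
    let handle = thread::spawn(move || -> io::Result<()> {
        let mut buf = vec![0u8; 65_536];
        for _ in 0..rounds {
            let (n, peer) = echo.recv_from(&mut buf)?;
            echo.send_to(&buf[..n], peer)?;
        }
        Ok(())
    });

    let payload = vec![0xABu8; payload_len];
    let mut buf = vec![0u8; 65_536];
    let start = Instant::now();
    for _ in 0..rounds {
        client.send(&payload)?;
        let n = client.recv(&mut buf)?;
        assert_eq!(n, payload_len, "echoed datagram has unexpected length");
    }
    let elapsed = start.elapsed();

    handle.join().expect("echo thread panicked")?;
    Ok(elapsed / rounds)
}
```

Calling, for example, `measure_udp_round_trip(1024, 1000)` returns the mean round-trip time for 1 KB datagrams; comparing that figure with the DDS-level numbers gives a rough idea of how much of the latency comes from the transport itself.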

RustDDS could be extended to implement shared memory also, but that is a non-trivial amount of work.

If you are interested in actual over-the-network comparison, then I suggest using two separate hosts to test.