jhelovuo / RustDDS

Rust implementation of Data Distribution Service
Apache License 2.0
316 stars 65 forks source link

Identified multiple memory leaks in listeners #325

Closed Valts-M closed 5 months ago

Valts-M commented 6 months ago

I've Identified at least 3 memory leaks in the codebase.

  1. When a new assembly buffer is created in the rtps reader https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/rtps/fragment_assembler.rs#L176-L179

What would happen is that after a datafrag is compete and removed from the assembly_buffers tree I would receive 3 more datafrags with the same writer_sn which would create a new assembly_buffer that would never get finished. I don't know why I receive 3 more datafrags then I am supposed to.

  1. When changes are added to the dds_cache

https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/structure/dds_cache.rs#L202

https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/structure/dds_cache.rs#L329-L343

Since Timestamp::ZERO is passed to the remove_changes_before function and since min and max remove count is the same the condition for skip_while is never false, meaning the cache is never cleared.

  1. When a receive buffer is created for the UDP listener

https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/network/udp_listener.rs#L176-L187

I didn't have time to figure out where a dangling reference is left to the ByteMut here. It is possible that it is the same as point 2. To emulate this one you can just run the no_key_async_usage_example, I just made the publisher publish way faster.

I'll attach some heaptrack files if you want to go through them. I had to zip them since github doesn't support zst

heaptrack.no_key_async_usage_example.1311794.zip

heaptrack.ros_api.1364577.zip

jhelovuo commented 6 months ago

I've Identified at least 3 memory leaks in the codebase.

These seem like a very good catch! Thank you for your effort.

  1. When a new assembly buffer is created in the rtps reader

This seems likely easiest one to fix. The proper logic would be to simply not create an AssemblyBuffer, if that SequenceNumber has already been received.

  1. When changes are added to the dds_cache

  2. When a receive buffer is created for the UDP listener

Like you speculate, these could very well be symptoms of the same leak.

The likely cause is that references to receive buffers (via Bytes) are kept as long as any sample in a buffer is in the DDS Cache. One possible fix would be to periodically clean the cache by identifying samples that have lived for a (relatively) long time, e.g. 20 seconds, and replacing their contents buffer (a Bytes buried inside a CacheChange) by a freshly-allocated copy. This should release the original receive buffer allocated in UDPListener.

I am not sure how Heaptrack decides what is a leak and what is not, but the DDS cache may grow without bound, if its not controlled by QoS settings History (use KeepLast instead of KeepAll) and Resource Limits.

I cannot immediately do anything about these, but we'll return to this issue when time permits.

jhelovuo commented 6 months ago

This should now be fixed in the latest master. Please see if it works for you now, and report the result.

jhelovuo commented 5 months ago

Fixes released in 0.9.1. Please (re)open an issue if you detect more memory leaks.