Closed Valts-M closed 5 months ago
I've Identified at least 3 memory leaks in the codebase.
These seem like a very good catch! Thank you for your effort.
- When a new assembly buffer is created in the rtps reader
This seems likely easiest one to fix. The proper logic would be to simply not create an AssemblyBuffer, if that SequenceNumber has already been received.
When changes are added to the dds_cache
When a receive buffer is created for the UDP listener
Like you speculate, these could very well be symptoms of the same leak.
The likely cause is that references to receive buffers (via Bytes
) are kept as long as any sample in a buffer is in the DDS Cache. One possible fix would be to periodically clean the cache by identifying samples that have lived for a (relatively) long time, e.g. 20 seconds, and replacing their contents buffer (a Bytes
buried inside a CacheChange
) by a freshly-allocated copy. This should release the original receive buffer allocated in UDPListener
.
I am not sure how Heaptrack decides what is a leak and what is not, but the DDS cache may grow without bound, if its not controlled by QoS settings History (use KeepLast instead of KeepAll) and Resource Limits.
I cannot immediately do anything about these, but we'll return to this issue when time permits.
This should now be fixed in the latest master
. Please see if it works for you now, and report the result.
I've Identified at least 3 memory leaks in the codebase.
What would happen is that after a datafrag is compete and removed from the assembly_buffers tree I would receive 3 more datafrags with the same writer_sn which would create a new assembly_buffer that would never get finished. I don't know why I receive 3 more datafrags then I am supposed to.
https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/structure/dds_cache.rs#L202
https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/structure/dds_cache.rs#L329-L343
Since Timestamp::ZERO is passed to the remove_changes_before function and since min and max remove count is the same the condition for skip_while is never false, meaning the cache is never cleared.
https://github.com/jhelovuo/RustDDS/blob/80f139a74795b72e963bd267df39dbc3be02b524/src/network/udp_listener.rs#L176-L187
I didn't have time to figure out where a dangling reference is left to the ByteMut here. It is possible that it is the same as point 2. To emulate this one you can just run the no_key_async_usage_example, I just made the publisher publish way faster.
I'll attach some heaptrack files if you want to go through them. I had to zip them since github doesn't support zst
heaptrack.no_key_async_usage_example.1311794.zip
heaptrack.ros_api.1364577.zip