Closed: tsender closed this issue 2 years ago
Thanks for reporting. Since you are running on Windows, would it be possible to run the Visual Studio profiler/debugger to get more information on the memory allocations/deallocations and identify some candidates for the memory leak? It would also be interesting to know whether the memory leak does indeed scale with the publish or receive rate on topics.
Off the top of my head, I see potential for missing deallocations in https://github.com/code-iai/ROSIntegration/blob/master/Source/ROSIntegration/Private/rosbridge2cpp/ros_bridge.cpp, https://github.com/code-iai/ROSIntegration/blob/master/Source/ROSIntegration/Private/rosbridge2cpp/TCPConnection.cpp, or in the converters of the individual messages.
I tried messing around with the VS debugger to see how it works, but I unfortunately don't have the kind of time to look deeply into this issue. If I have any spare time, I will try to look into the memory allocations for certain parts of the code, but it is unlikely I'll be able to do much.
I actually did find some time to look into this. Here are my observations from using the Visual Studio memory profiler:
Let X denote the number of bytes that a given bson_t message takes up in memory once populated.
msg.ToBSON(*message) call
Aside from that weird case where an outer function shows zero memory allocations but the inner function does, everything seems to add up (unless that is the problem?), but strangely there is still a memory leak somewhere.
Does this give you any ideas?
Thanks for the report. This can make us cautiously optimistic that the sending part is not the problem. Do you know if the leaked memory maybe scales with the frequency of received messages on a topic? Maybe the faulty memory allocation is happening in that part.
Well, I am still quite confident that publishing messages does have a memory leak.
In all of my experiments, I only tested publishers (I did have subscribers active, but I know they were not receiving anything). In one experiment, I commented out every call to UTopic::Publish() and let the code run for 24 hours to see what would happen. After 24 hours there was no discernible memory leak. But as soon as I uncommented those publish calls, memory started to leak. That is the only proof I have for the existence of the memory leak.
I simply cannot track down the leak because I am calling the UTopic::Publish() command in the game thread, but that message gets published in a different thread created in your code.
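One generic way to follow buffers that are created on one thread and destroyed on another is to count them with a thread-safe counter and compare the count before and after a run. The sketch below only illustrates that idea: `TrackedMessage` and `live_messages()` are hypothetical names, not part of ROSIntegration or libbson, and the payload is a stand-in for whatever the plugin actually allocates.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical counter of messages that have been created but not yet destroyed.
// std::atomic makes the updates safe from both the game thread and a publisher thread.
std::atomic<long>& live_messages() {
    static std::atomic<long> counter{0};
    return counter;
}

// Hypothetical stand-in for a message buffer (the plugin itself uses bson_t).
class TrackedMessage {
public:
    explicit TrackedMessage(std::size_t size) : payload_(size) { ++live_messages(); }
    ~TrackedMessage() { --live_messages(); }

    // Non-copyable so that each instance is counted exactly once.
    TrackedMessage(const TrackedMessage&) = delete;
    TrackedMessage& operator=(const TrackedMessage&) = delete;

    std::size_t size() const { return payload_.size(); }

private:
    std::vector<unsigned char> payload_;
};
```

If `live_messages()` keeps growing while messages are being published, some buffers are created but never destroyed, regardless of which thread was supposed to destroy them.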
I've done a bit more testing. After each call to the UTopic::Publish() method, X kilobytes are added to the heap memory (note that X depends on the type of message and the amount of data it contains). However, when that message is ready to be sent over the ROS network in the ROSBridge::RunPublisherQueueThread() function, after the message is sent and subsequently destroyed with the bson_destroy(msg) call, only X - 0.13 KB of memory is freed. I have tested this with several types of messages, and when publishing there always seems to be a constant 0.13 KB per message that is never freed (regardless of message type). Since this is per message, the leak will certainly scale with the publishing frequency.
I did notice one weird thing: When publishing the /clock message, each call to the UTopic::Publish() for clock messages adds exactly 0.13 KB of memory to the heap, and when these messages get sent over the ROS network, 0 kilobytes of memory are freed. Even though this number is 0.13 KB, it might just be a coincidence because not every message contains a clock/timestamp field.
I certainly cannot figure out where this 0.13 KB is coming from and why it is not being freed, but perhaps you will be able to figure it out.
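To put the per-message figure in perspective, the arithmetic can be sketched as follows. This assumes that the reported 0.13 KB is a rounded 128 bytes and that 1 MB means 1024 * 1024 bytes; both are assumptions, and the function names are made up for this example.

```cpp
// Bytes leaked per hour if every published message leaves `bytes_per_message` behind.
constexpr double leaked_bytes_per_hour(double bytes_per_message,
                                       double messages_per_second) {
    return bytes_per_message * messages_per_second * 3600.0;
}

// Publish rate (messages per second, summed over all topics) that would be
// needed to explain an observed leak rate given in bytes per hour.
constexpr double implied_messages_per_second(double observed_bytes_per_hour,
                                             double bytes_per_message) {
    return observed_bytes_per_hour / (bytes_per_message * 3600.0);
}
```

Under those assumptions, 100 messages per second would leak about 46 MB per hour, and the roughly 250 MB per hour from the original report would correspond to about 570 published messages per second across all topics.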
So, I actually was able to track down the issue and figure out what exactly was causing the memory leak. Long story short, bson_new() and bson_init(message) do two different things: bson_new() allocates and initializes a new bson_t structure on the heap, whereas bson_init(message) initializes a bson_t structure that the caller provides, typically one on the stack. Also, the BCON_NEW(...) command is another way to create a new bson_t structure on the heap. Further, there were numerous places where I saw something like
bson_t* message = new bson_t; // This creates an uninitialized bson_t on the heap
bson_init(message); // This then initializes that same bson_t in place, as if it were stack-allocated
or
bson_t* message = bson_new(); // This creates an initialized bson_t on the heap
bson_init(message); // This then re-initializes that same bson_t in place, as if it were stack-allocated
In both scenarios, calling bson_destroy(message) after either code block appears to release only part of what was allocated, which leads to a memory leak.
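One possible reading of libbson's behavior that would fit these observations: bson_init() does not allocate anything itself. It initializes the structure it is given and marks it as caller-owned, so a later bson_destroy() releases any grown data buffer but leaves the structure itself alone. If the structure actually came from the heap, it is then never freed. libbson documents bson_t as a 128-byte structure, which is close to the 0.13 KB seen per message. The toy model below mimics that flag logic so it can be run without libbson. Every name in it is invented for illustration, and the behavior it models is an assumption about libbson rather than a copy of its code.

```cpp
// Toy model only: none of this is libbson code.
constexpr unsigned kToyInline = 1u;  // data lives inside the struct
constexpr unsigned kToyStatic = 2u;  // the struct itself is owned by the caller

struct ToyBson {
    unsigned flags;
    unsigned char inline_data[120];
};

// Number of ToyBson structs currently allocated on the heap by toy_new().
static int g_live_structs = 0;

// Models bson_new(): allocates the struct and marks it as owned by the library.
ToyBson* toy_new() {
    ToyBson* b = new ToyBson{};
    b->flags = kToyInline;
    ++g_live_structs;
    return b;
}

// Models bson_init(): initializes caller-provided storage in place, no allocation.
void toy_init(ToyBson* b) {
    b->flags = kToyInline | kToyStatic;
}

// Models bson_destroy(): frees the struct only if it is not marked caller-owned.
void toy_destroy(ToyBson* b) {
    if (b == nullptr) return;
    if ((b->flags & kToyStatic) == 0u) {
        delete b;
        --g_live_structs;
    }
}

// The pattern from the thread: heap allocation followed by an in-place init.
// Returns how many structs are still allocated after destroy.
int leaked_after_new_then_init() {
    const int before = g_live_structs;
    ToyBson* m = toy_new();
    toy_init(m);     // overwrites the ownership flag set by toy_new()
    toy_destroy(m);  // sees "caller-owned" and does not free the struct
    const int leaked = g_live_structs - before;
    delete m;        // clean up so the demo itself does not leak
    --g_live_structs;
    return leaked;
}

// Heap allocation with a matching destroy, and nothing in between.
int leaked_after_new_only() {
    const int before = g_live_structs;
    ToyBson* m = toy_new();
    toy_destroy(m);
    return g_live_structs - before;
}

// Stack storage with an in-place init and a matching destroy.
int leaked_after_stack_init() {
    const int before = g_live_structs;
    ToyBson m;
    toy_init(&m);
    toy_destroy(&m);
    return g_live_structs - before;
}
```

In this model only the mixed pattern leaves a structure behind. Either pairing used on its own (heap creation with destroy, or stack storage with init and destroy) comes out balanced.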
I forked the repo and modified the entire plugin to account for the fix (I decided that all bson_t structures should be created only via bson_new(), which also makes it easier to debug with the VS memory profiler). I also cleaned up a few other things and added a few new messages. I do still see a small memory leak from one of my service calls, but I can't quite tell what could be causing the memory to go up ever so slightly. Either way, I fixed the vast majority of the problem. I'll do a bit more testing before I submit a pull request with all the changes.
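A convention like "create only via bson_new()" can also be enforced in C++ by tying each structure to a smart pointer whose deleter calls the matching destroy function. The sketch below shows that pattern with stand-in types so that it compiles without libbson. `FakeBson`, `fake_bson_new()`, `fake_bson_destroy()` and `make_message()` are hypothetical, and this is not necessarily how the fork implements the fix.

```cpp
#include <memory>
#include <utility>

// Stand-ins so the sketch compiles on its own; in the plugin these would
// correspond to bson_t, bson_new() and bson_destroy().
struct FakeBson {
    int field_count = 0;
};

static int g_fake_live = 0;  // number of FakeBson objects currently alive

FakeBson* fake_bson_new() {
    ++g_fake_live;
    return new FakeBson();
}

void fake_bson_destroy(FakeBson* b) {
    if (b != nullptr) {
        --g_fake_live;
        delete b;
    }
}

// Deleter that routes destruction through the matching destroy function.
struct FakeBsonDeleter {
    void operator()(FakeBson* b) const { fake_bson_destroy(b); }
};

using FakeBsonPtr = std::unique_ptr<FakeBson, FakeBsonDeleter>;

// Single creation point: every message comes from the same allocator and is
// destroyed exactly once when its owner goes out of scope.
FakeBsonPtr make_message() {
    return FakeBsonPtr(fake_bson_new());
}
```

Ownership can then be handed to a publisher queue with `std::move`, and whichever thread drops the last owner triggers the destroy call.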
Thanks for the informative update and the good news! Looking forward to your PR.
It seems that I cannot identify the cause for the memory leak with my service calls. I have traced all the lines dealing with bson_t structures and all the memory appears to be accounted for (anything that's created gets fully destroyed), but I still see the memory slightly increase with each service call. Since most people primarily work with publishers/subscribers, this should not be an issue since I was able to fix the memory leak on that front. I'll submit my PR with all the changes I could make.
I am going to keep this issue open for now until I or someone else can determine the cause for the service call memory leak.
Do you plan to accept my PR soon?
I'm closing this issue because I do not believe this to be a problem anymore.
I have recently noticed that this plugin appears to have a memory leak. I am currently seeing a leak rate of about 250 MB / hour (this rate may be specific to my project or depend on how much data one is publishing/subscribing with this plugin). While this is not a substantial rate over a short period of time, I need to run simulations for several days straight to collect data for a neural network, so it certainly adds up.
I have verified that if I do not run this plugin in my code, I do not see any leak. However, when I enable this plugin and use it for publishing and subscribing, I see a memory leak. I have experimented with only publishing messages and only subscribing to messages to see if there was a difference, but I still seem to see a memory leak.
Any thoughts on why this may be occurring? For reference, I am currently running UE 4.26 on Windows 10. Thanks.