OpenFTC / EasyOpenCV

Finally, a straightforward and easy way to use OpenCV on an FTC robot!
219 stars 100 forks source link

False memory leak warning #63

Closed FromenActual closed 1 year ago

FromenActual commented 1 year ago

TLDR; the leakDetection() in OpenCvPipeline.java is not quite correct, the garbage collection offset calculation is wrong.

We found during our competition last weekend, this memory leak warning came up while sitting in init() for exceptionally long times. Today I let it sit for about 25 minutes, and it claimed over 20GB of RAM was leaked. I believe the Control Hubs only have 1GB of RAM, so there's no way this is possible.

image

Looking at the Profiler in Android Studio shows the garbage collector is actually doing its job, and never exceeding ~400MB.

android_ram

So looks like the garbage collection offset in that code is not correct, needs to be updated.

Windwoes commented 1 year ago

The telemetry message is claiming the GC ran 91 times, though, so it does know the GC is running. The total amount leaked is the running sum that doesn't take into account what the GC reclaims. Even if the GC is doing its job, it is not something you should count on, and you should update your code to reuse framebuffers.

Windwoes commented 1 year ago

Note when I say don't count on it: there are some peculiarities with OpenCV in Java which results in the GC not always knowing that it needs to run. The "big" part of the Mat object is a native allocation, not a Java one. I have personally seen OOMEs very often with OpenCV on Android where the GC does not act fast enough.

FromenActual commented 1 year ago

Ah, okay! I didn't realize it was logging the total memory ignoring the GC, I thought it was the total memory right now. I personally think the latter would be more useful; if the GC is doing its job, then it doesn't really matter if 20GB of total memory has been leaked. But if the GC isn't working fast enough, you could easily tell from the telemetry (eg. more than ~800MB used right now). IMO gcLeakOffsetMb should be removed, but curious to hear counter arguments.

I'm no Java/GC expert, so just to make sure I'm clear, you're saying instead of doing this:

@Override
public Mat processFrame(Mat input)
{
    Mat output = // whatever
    return output;
}

Do this instead:

Mat output;

@Override
public Mat processFrame(Mat input)
{
    output = // whatever
    return output;
}

Correct? If so, I would request the telemetry message be updated to make that more clear. I've been volunteering as an FTA this season, and have seen this warning from multiple teams using EasyOpenCV; that indicates to me that teams don't understand the warning and how to fix it. I do see this comment about it, but I'd bet many people haven't actually read it (I didn't until now!). Maybe add a helpful telemetry message like "Don't create Mat objects in processFrame()" or something like that so teams can figure it out quickly.

Windwoes commented 1 year ago

IMO gcLeakOffsetMb should be removed, but curious to hear counter arguments.

I'm not really sure if I have a particularly strong counter argument to offer. The never going above 400MB you mentioned is for the RC app alone I believe, I don't think it takes into account the memory used by the OS. So if we only show native heap allocation size, it might be deceiving, e.g. crash happens at 500MB used. But, yeah, the total memory ignoring the GC itself is probably not a very useful statistic. I think the reason I added it was because I use that to calculate the guesstimated average leak rate, so it was already there and I just put it on telemetry too.

I'm no Java/GC expert, so just to make sure I'm clear, you're saying instead of doing this: Do this instead:

Well, not exactly. In your second example, you are still reassigning the "output" object "pointer" (yeah yeah Java doesn't have pointers but that's basically what's going on) and by doing that the previous object that it was pointing to becomes garbage. So really, you want to avoid new()ing Mats, AND you want to avoid reassigning Mats.

If you do, for example:

Mat tmp;

@Override
public Mat processFrame(Mat input)
{
    input.copyTo(tmp);
    return tmp;
}

Now the object that tmp is pointing to is always the same.

I would request the telemetry message be updated to make that more clear. I've been volunteering as an FTA this season, and have seen this warning from multiple teams using EasyOpenCV; that indicates to me that teams don't understand the warning and how to fix it.

Teams ignore even very clear warnings too :see_no_evil: for example my old team thought that the "OpMode Force Stopped" popup was "normal" when you press stop while an OpMode is running. But that being said yes, we can make the warning more helpful. Maybe something like Do not create new Mats or re-assign Mat objects inside processFrame()?

NoahAndrews commented 1 year ago

Do not create new Mats or re-assign Mat objects inside processFrame()

It's not so much that you're reassigning the object, but the variable. I think using the term variable here makes more sense and is clearer.

FromenActual commented 1 year ago

I think the reason I added it was because I use that to calculate the guesstimated average leak rate, so it was already there and I just put it on telemetry too.

That's fair, the rate could be useful to know. I think that's reason enough to keep gcLeakOffsetMb, but just not use it for the telemetry.

I don't think it takes into account the memory used by the OS.

Good point. In that case, it'd probably be helpful to track the total system memory usage; looks like there's a couple solutions offered here. Then the telemetry could be modified to show the total free memory or something? Idk, just something to indicate whether the GC is doing its job.

Teams ignore even very clear warnings too

Very true! I can't tell you how many times a student has called me over because they didn't read the big red error message that tells them exactly what to do.

That message seems fine to me. I'm not too picky about it, as long as it points users in the right direction.

So really, you want to avoid new()ing Mats, AND you want to avoid reassigning Mats.

Gotcha, thanks for the clarification! Although I don't quite see how to avoid creating new Mats with some of OpenCV's functions, like cvtColor() or inRange(). Those take a source and destination Mat, and since the source Mat isn't modified, it must be creating a new Mat for the destination, right? Is there a way around this? If not, I don't see how to avoid leaking memory.

Side note - I'm not sure "leak" is the right word to use here. As I understand it, memory leaks occur in Java when the GC can't remove unused objects from memory (source). In this case, the extra Mats are being cleaned up by the GC, so it's not really a "leak".

Windwoes commented 1 year ago

It's not so much that you're reassigning the object, but the variable. I think using the term variable here makes more sense and is clearer.

Valid

Windwoes commented 1 year ago

Good point. In that case, it'd probably be helpful to track the total system memory usage; looks like there's a couple solutions offered here. Then the telemetry could be modified to show the total free memory or something? Idk, just something to indicate whether the GC is doing its job.

Fair.

Gotcha, thanks for the clarification! Although I don't quite see how to avoid creating new Mats with some of OpenCV's functions, like cvtColor() or inRange(). Those take a source and destination Mat, and since the source Mat isn't modified, it must be creating a new Mat for the destination, right? Is there a way around this? If not, I don't see how to avoid leaking memory.

Assuming that you've allocated the destination mat, it will simply write to the existing destination mat object without creating a new one. It's like when you pass an int array to a function that modifies it, the modification is visible to the caller even after the callee has returned, because they're both looking at the same object, not copies of it.

Side note - I'm not sure "leak" is the right word to use here. As I understand it, memory leaks occur in Java when the GC can't remove unused objects from memory (source). In this case, the extra Mats are being cleaned up by the GC, so it's not really a "leak".

shrug. On some devices the GC doesn't run (presumably because it doesn't know the tiny Java objects have a huge native heap contribution) so the GC just sits there and does nothing while you run out of memory.

FromenActual commented 1 year ago

it will simply write to the existing destination mat object without creating a new one.

Ah, got it! I probably should have been able to figure that out 😛

On some devices the GC doesn't run

Woah, really? Good to know!

Alrighty, I think that's all I have to say on this. Thank you for taking the time to help me understand this, and I look forward to seeing the telemetry warning message get updated!

Windwoes commented 1 year ago

Yep, I'll make sure it gets into the next release.

Windwoes commented 1 year ago

I've been volunteering as an FTA this season, and have seen this warning from multiple teams using EasyOpenCV; that indicates to me that teams don't understand the warning and how to fix it.

I FTA'd at an event yesterday and had multiple teams with the warning who also didn't know what it meant or how to fix it so yeah that assessment seems accurate to me.

Windwoes commented 1 year ago

Fixed in 1.6.0