heisters / libglvideo

Simple cross-platform MP4/MOV video player for use with OpenGL
Other
28 stars 5 forks source link

Small memory leak #6

Closed mattfelsen closed 5 years ago

mattfelsen commented 5 years ago

Another thing we've been seeing while stress testing our apps overnight (with randomly generated touch input) is a small but steady memory leak, apparently coming from Movie::bufferNextGPUSample() and apparently something small and variable, ie 40, 48, or 1712 bytes, and definitely not the frame size (i.e. ~2MB for HD). To give an idea of the scale, the app launches with under 1GB of mem usage and after running overnight it's over 10GB.

I looked around in Movie, Frame, and FrameTexture and didn't see anything obvious, but if you've got any leads or pointers I'm sure that'd be helpful. Thanks again!

heisters commented 5 years ago

Awesome! Thanks for finding a memory leak!

It makes sense that it would happen in bufferNextGPUSample()... Take a look at the method bufferNextGPUSample() calls: /src/Frame.cpp#L24. I think that and the decoders are the only places manual memory management is happening. So also take a look at the Hap decoder: /src/decoders/hap.cpp. It could be some error in the calculated texture size or something? I've never been pleased with how hap.cpp basically guesses at the texture size, and then tries making the buffer larger until it doesn't get an error /src/decoders/hap.cpp#L100.

heisters commented 5 years ago

Does the memory leak repro on any of the sample apps? Or can you write a small sample app that reproduces it?

mattfelsen commented 5 years ago

Ooh, good question. I'll run the sample apps and take a look into the decoder as well and report back 👍

mattfelsen commented 5 years ago

I took a look with the stress sample app and am seeing what seems to be the same leak. I'll poke around a bit more, and perhaps give a shot at updating the hap lib/submodule to the latest, and share any updates

heisters commented 5 years ago

ok, great. I'll try to take a look at it myself too.

mattfelsen commented 5 years ago

Hm, no luck upgrading hap to master (since there are no releases) nor Bento4 to v1.5.1-628 (the latest release).

Re: memory management, in Decoder::Hap, m_decompressedTextureBuffer seems to be the only data buffer, and it's reused for each new frame of the movie. Doesn't seem to be the source of any leaks since it's only instantiated once per movie. Related, I never actually see that "buffer too small" warning get hit.

In Frame, the m_texData buffer is wrapped in a unique_ptr<unsigned char[]>, so it should get cleaned up when the frame is destructed. I also tried replacing it with a vector<unsigned char> (and replacing calls to .get() with .data()), but no dice either.

I don't actually see any memory allocations in Frame::bufferTexture() like you mention, just a copy into the pbo.

🤔

mattfelsen commented 5 years ago

Hey Ian, have you by chance had a moment to take a look here? We've tried a few things, including taking a rough stab at using a texture pool instead of a new Frame/FrameTexture for each frame, but that didn't seem to help. https://github.com/claytercek/libglvideo/commit/fb3adeb99ce8541142daea8dbb646b40d8edfc18

Here are some screenshots from different memory snapshots in VS (stacks view & types view). Not a ton of insight here, but one thing that's curious is in the stacks view (first image) is shows a size diff of +2MB, but in the types view, there is only a ~150KB increase in mem shown. Where did the rest of it go??

stacks-view

types-view

Any ideas on additional things to try or look into?

heisters commented 5 years ago

Sorry, I haven't had a chance to look at it yet. I've blocked out some time later this week.

heisters commented 5 years ago

I don't think 33a7b36 should fix the leak, but please see if it changes anything for you.

It may have something to do with std::deque (the underlying type for concurrent_queue) not releasing memory until program exit. I wonder if this may even cause the shared_ptr (Frame::ref) to stick around? I'd think the memory leak would be a lot worse if that were the case. In any case, one option may be to rewrite concurrent_queue to use a custom buffer data type or maybe std::array.

claytercek commented 5 years ago

@heisters thanks for looking into this! I did some digging around and found that the glFenceSync function in Frame.cpp seems to be what's eating up our memory. I came up with a quick fix for this by calling glDeleteSync in the Frame destructor. Right now this seems to be working very well! No longer seeing any signs of a memory leak.

As far as I can tell this fix doesn't seem to have any adverse side effects, but also I am not 100% sure what role m_sync plays. Curious to hear your thoughts on this. Thanks!

heisters commented 5 years ago

Good catch! Forgot GLSync is actually a typedef of __GLSync * and needs to be cleaned up manually.

The fence sync is to asynchronously check whether the PBO is finished buffering to the GPU. So this caught another potential bug where Movie would update the current texture before the texture was done buffering. In practice, this shouldn't happen unless your buffer sizes are set small, and/or GPU bandwidth is saturated. Nevertheless, I added a check to catch those cases, and make that fence sync actually useful.

Fixes are on master b60cc26b

heisters commented 5 years ago

One other note, about @claytercek's texture pool patch. That's a good improvement, and something I've had in mind to do. If you can test and make sure it works well, I'd be happy to take a PR.