intel / libyami

Yet Another Media Infrastructure. it is core part of media codec with hardware acceleration, it is yummy to your video experience on Linux like platform.
Apache License 2.0
146 stars 106 forks source link

vp9 decoding: when using the External SurfaceAllocator API, seeing a memleak on a specific stream. #856

Closed angelo-p closed 6 years ago

angelo-p commented 6 years ago

I uploaded the file to my drobbox account. you can access at: https://www.dropbox.com/s/cimapj5d0z66kz0/feelings_vp9-640x360.webm?dl=0

debug code shows that for this specific stream, once in a while a surface get requested through the surface_do_get_callback function and it never get released. Please help in debug this issue.

xuguangxin commented 6 years ago

Could you check it using yamidecode? Will it have same issue? Actually, decoder use same api for external or internal allocator https://github.com/intel/libyami/blob/apache/decoder/vaapidecoder_base.cpp#L226

angelo-p commented 6 years ago

yamidecode doesn't show the issue but it is not using the External Surface Allocator.

This is the sequence of calls I see when the problem occurs: ==> surface_do_allocate_callback() : takes from my external mpool a va_surface (0x4000000) ==> surface_do_put_callback(): releases to my external mpool the va_surface (0x4000000) ==> decodeGetOutput(): return a VideoFrame *ovf with ovf->surface == 0x4000000 This is causing a problem because the ovf->surface returned by decodeGetOutput() has already been released by the surface_do_put_callback() call.

I traced down the invocation of the problematic surface_do_put_callback() to the following line of code: vaapidecoder_vp9.cpp:140 ==> m_reference.clear();

139 if (hdr->frame_type == VP9_KEY_FRAME) { 140 m_reference.clear(); 141 m_reference.resize(VP9_REF_FRAMES); 142 } else {
143 FILL_REFERENCE(last_ref_frame, VP9_LAST_FRAME); 144 FILL_REFERENCE(golden_ref_frame, VP9_GOLDEN_FRAME); 145 FILL_REFERENCE(alt_ref_frame, VP9_ALTREF_FRAME); 146 }

Please help analysing the problem further. Thanks, Angelo

xuguangxin commented 6 years ago

It uses internal allocator, but the API is same as the external allocator, it's no different, you can copy the allocator code and set it to the decoder. I am little confused by your call sequence, did you see the get surface operation before we put it

angelo-p commented 6 years ago

Yes, my surface_do_allocate_callback maps to YamiStatus (getSurface)(SurfaceAllocParams thiz, intptr_t surface); and my surface_do_put_callback maps to YamiStatus (putSurface)(SurfaceAllocParams* thiz, intptr_t surface); The problem is that putSurface is called and the following decodeGetOutput call returns a surface that was just released. I see this sequence of calls only when using the vp9 codec. gdb shows that when this happen, the putSurface call is triggered by the code at vaapidecoder_vp9.cpp:140 ==> m_reference.clear();

xuguangxin commented 6 years ago

the getOuput will return a SharedPtr, it will hold a reference for the frame. So decoder will not release the frame. Could you share your code? So I can help you review and debug it thanks

angelo-p commented 6 years ago

@xuguangxin I am saying that when decodeGetOutput() is called, it is returning a VideoFrame *ovf with a ovf->surface that has already been released.

find attached the my code yami_codec_interface.c.txt

xuguangxin commented 6 years ago

@angelo-p , sorry, your code is a little complex, And I can't find a way to compile it. I try to modify current yami code to reproduce your issue. but failed. If you tried this patch, it will print something if yami output something freed. But it did not. checkoutput.zip Could you try it? I will try to implement an external allocator tomorrow if this not easier your concern. :)

thanks

angelo-p commented 6 years ago

Hi Thanks again for looking into this. I tried your patch (I added a '\n' at the end of the path debug printf()). Here is the console output when I play the vp9 stream:

23.835[5] testplayer: Multimedia graph: 23.836[5] testplayer: mp4_reader0:EncodedVideoOut -> YamiVideoDecoder0:YamiVideoIn 23.838[5] testplayer: YamiVideoDecoder0:YamiVideoOut -> Queue0:QueueIn 23.841[5] testplayer: Queue0:QueueOut -> VideoCard0:ScreenWriterInput-0 23.842[5] testplayer: End of graph. 23.861[3] apply_file_storage() MetadataStoragePath not set 23.862[6] Created router MMF router for track /amedia/drive/vp9/feelings_vp9-640x360.mp4 23.863[6] Engine 'Single-track engine' created for track /amedia/drive/vp9/feelings_vp9-640x360.mp4 23.864[5] mmf routing: have new metadata from mp4_reader0 23.865[5] mp4_reader: EncodedVideoOut: NextBuffer queue empty depth 185 nbytes 0 WAITING 23.867[5] yami_video_decoder:HandleListenerEvent:1928:seeing MM_RESOURCES_RETURNED 23.868[3] yami_video_decoder:acquire_managed_resources:475:Already had decoder resources. 23.871[5] yami_video_decoder:ProcessInputMra:1524:seeing DECODER_CONFIG count 1 payload 0x8a7b000 size 10 0x0 0x0 23.873[5] yami_codec_interface:surface_do_allocate_callback:295:asking for 16 (8) got 22 640x384 surfaces fourcc='NV12' generic=0 23.877[5] yami_codec_interface:surface_do_allocate_callback:385:[0] the va_sid=0x4000000 is associated with the mmf egl=0x20 23.879[5] yami_codec_interface:surface_do_allocate_callback:385:[1] the va_sid=0x4000001 is associated with the mmf egl=0x21 23.881[5] yami_codec_interface:surface_do_allocate_callback:385:[2] the va_sid=0x4000002 is associated with the mmf egl=0x22 23.884[5] yami_codec_interface:surface_do_allocate_callback:385:[3] the va_sid=0x4000003 is associated with the mmf egl=0x23 23.886[5] yami_codec_interface:surface_do_allocate_callback:385:[4] the va_sid=0x4000004 is associated with the mmf egl=0x24 23.888[5] yami_codec_interface:surface_do_allocate_callback:385:[5] the va_sid=0x4000005 is associated with the mmf egl=0x25 23.891[5] yami_codec_interface:surface_do_allocate_callback:385:[6] the va_sid=0x4000006 is associated with the mmf egl=0x26 23.893[5] yami_codec_interface:surface_do_allocate_callback:385:[7] the va_sid=0x4000007 is associated with the mmf egl=0x27 23.897[5] yami_codec_interface:surface_do_allocate_callback:385:[8] the va_sid=0x4000008 is associated with the mmf egl=0x28 23.899[5] yami_codec_interface:surface_do_allocate_callback:385:[9] the va_sid=0x4000009 is associated with the mmf egl=0x29 23.900[5] yami_codec_interface:surface_do_allocate_callback:385:[10] the va_sid=0x400000a is associated with the mmf egl=0x2a 23.903[5] yami_codec_interface:surface_do_allocate_callback:385:[11] the va_sid=0x400000b is associated with the mmf egl=0x2b 23.905[5] yami_codec_interface:surface_do_allocate_callback:385:[12] the va_sid=0x400000c is associated with the mmf egl=0x2c 23.907[5] yami_codec_interface:surface_do_allocate_callback:385:[13] the va_sid=0x400000d is associated with the mmf egl=0x2d 23.909[5] yami_codec_interface:surface_do_allocate_callback:385:[14] the va_sid=0x400000e is associated with the mmf egl=0x2e 23.911[5] yami_codec_interface:surface_do_allocate_callback:385:[15] the va_sid=0x400000f is associated with the mmf egl=0x2f 23.913[5] yami_codec_interface:surface_do_allocate_callback:385:[16] the va_sid=0x4000010 is associated with the mmf egl=0x30 23.916[5] yami_codec_interface:surface_do_allocate_callback:385:[17] the va_sid=0x4000011 is associated with the mmf egl=0x31 23.918[5] yami_codec_interface:surface_do_allocate_callback:385:[18] the va_sid=0x4000012 is associated with the mmf egl=0x32 23.919[5] yami_codec_interface:surface_do_allocate_callback:385:[19] the va_sid=0x4000013 is associated with the mmf egl=0x33 23.922[5] yami_codec_interface:surface_do_allocate_callback:385:[20] the va_sid=0x4000014 is associated with the mmf egl=0x34 23.924[5] yami_codec_interface:surface_do_allocate_callback:385:[21] the va_sid=0x4000015 is associated with the mmf egl=0x35 output a freed surface, id = 4000000 output a freed surface, id = 4000001 23.951[3] yami_codec_interface:pop_sorted_pts:587:Concealing a missing PTS returning pts=66ms domain=0: last_pts=33ms + frame_time=33ms 23.951[5] Content availability event: 0x2 -> 0x2 output a freed surface, id = 4000002 output a freed surface, id = 4000003 23.970[5] EncodedVideoOut:event_onpush Posting an HWM_RISING event queue depth 185 count 38/1075146bytes hwmark 1048576bytes 23.971[6] MMF router: clock event (output started) output a freed surface, id = 4000004 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000002 output a freed surface, id = 4000003 24.130[5] Stopping prefetch 24.130[5] mp4_reader-->Prefetching threshold = 2097152, queue size = 5259498, q max size = 5242880 24.131[5] mp4_reader: Waiting for transition to a playing state. state = 0x100. eos = 0x0 output a freed surface, id = 4000004 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000003 output a freed surface, id = 4000004 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000005 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000006 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000004 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000003 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002 output a freed surface, id = 4000000 output a freed surface, id = 4000001 output a freed surface, id = 4000006 output a freed surface, id = 4000005 output a freed surface, id = 4000004 output a freed surface, id = 4000002

angelo-p commented 6 years ago

@xuguangxin, Let me know if there is anything I can do to help in investigating this issue. Thanks, Angelo

xuguangxin commented 6 years ago

@angelo-p , Did you use different threads to do decode and getoutput?

I will implement a simplified external allocator in yamidecode. ETA is 1~2days later. If I can't reproduce the issue on this allocator. I may ask you help reproduce it on the simplified allocator.

thanks

xuguangxin commented 6 years ago

@angelo-p ,I implemented a external allocator, I still can't reproduce your issue. Could you try the last patch in https://github.com/xuguangxin/libyami-utils/commits/ext_alloc. Try to reproduce your issue on top of this patch?

angelo-p commented 6 years ago

@xuguangxin, our application calls decode and getouput from a single thread. I cannot reproduce the issue using yamidecode and your latest patch, but I should mention that your patch has a C++ allocator implementation and our app is using the Capi interface exposed in the VideoDecoderCapi.h header file when interacting with libyami.

xuguangxin commented 6 years ago

Did you call free in other thread? If yes, you may need to do some lock like I do.

Let me convert my example to CAPI. But it may take some time. The ETA maybe next Monday. @angelo-p

angelo-p commented 6 years ago

No, free on the output VideoFrame is called from the same thread as decode and getouput. As soon as getouput. returns a VideoFrame* (ovf), we process it and call ovf->free(ovf) right away on it.

xuguangxin commented 6 years ago

@angelo-p , this is capi version. you can use "yamidecode -i xxx.vp9 -m -1 --capi" to test it. The implementation is c++ based because the lock and map/set things in c is boring. If you are interested, you can convert it to pure c version.

But the API is C based, no c++ magic. if possible, could you reproduce the issue on it ?

thanks

https://github.com/xuguangxin/libyami-utils/tree/eac https://github.com/xuguangxin/libyami/tree/eac

angelo-p commented 6 years ago

Thanks @xuguangxin, I'll run the test with your app.

angelo-p commented 6 years ago

@xuguangxin, With your test app, I was able to root-cause an allocator problem in our code. Thanks again for your help.

xuguangxin commented 6 years ago

glad to hear this. :)