jc-kynesim / rpi-ffmpeg

FFmpeg work for RPI
Other
111 stars 27 forks source link

v4l2_req_media: race condition between destroying `media_pool` and returning `media_request` to it #78

Open peat-psuwit opened 10 months ago

peat-psuwit commented 10 months ago

It seems to be possible that media_pool is destroyed before all media_request are returned to it. When a remaining media_request is returned with media_request_done(), the use-after-free occurs on media_pool, and that media_request is also leaked since no one will free it.

This is the output from Valgrind which lead to this discovery. The exact commit used is 61733f14a6c ("v4l2: Add (more) RGB formats to DRM & V4L2") on branch dev/6.0/rpi_import_1: https://paste.ubuntu.com/p/GBwwSPVrbp/

Inspired by how Gstreamer do it, I think the solution is to make media_pool ref-counted, then make media_request_get() increase the refcount and media_request_done() decrease it. This makes sure that media_pool outlive the media_request.

jc-kynesim commented 9 months ago

Thanks - I'll look into that. I do run valgrind over that code code but obviously my test setup missed that.

jc-kynesim commented 9 months ago

I think I see the problem - it is, I think, a locking fail on delete where the req chain can become broken, but this doesn't happen for me by default - can you give me whatever stream you are using to test with including the appropriate command line so I can check that (a) I can reproduce the fail and (b) I've fixed it. Ta

peat-psuwit commented 9 months ago

The command I used is:

valgrind --leak-check=full /path/to/ffmpeg -hwaccel drm -f concat -i imoutestvideo.txt -f null -

Where the imoutestvideo.txt contains 12 lines of text file 'imoutestvideo.mp4', and the imoutestvideo.mp4 is the file embedded in #75. I think the trick is the fact that this video (by itself) causes a lot of v4l2_request_hevc_uninit() and v4l2_request_hevc_init(), as discussed in that PR.


I'm not sure what you mean by "a locking fail", but if you're talking about locking in media_request_done(), then the lock would be meaningless if media_pool and its associated lock are freed by the time that function is called?