Closed StefanBossbaly closed 2 years ago
As far as I know, ffmpeg will destroy the UMD device and reallocate a new one when a resize happens. Did you check whether ffmpeg saves the reference list or clears it on a non-keyframe resize? I suspect the issue may be caused by losing the reference list. For a non-keyframe resize, ffmpeg needs to save the reference list and restore it after reallocating the new UMD device.
Is it possible to reproduce with the sample decoder?
As far as I know, ffmpeg will destroy the UMD device and reallocate a new one when a resize happens. Did you check whether ffmpeg saves the reference list or clears it on a non-keyframe resize? I suspect the issue may be caused by losing the reference list. For a non-keyframe resize, ffmpeg needs to save the reference list and restore it after reallocating the new UMD device.
I'm not too familiar with how ffmpeg works under the hood, but I know that in the Fuchsia decoder we keep the reference frames of the other dimensions that were created under a different context. From my understanding, VASurfaces don't have to be bound to a VAContext, so having multiple VASurfaces of different resolutions in the VADecPictureParameterBufferVP9::reference_frames array is a valid use of the API. It's also my understanding that those surfaces can exist prior to the creation of the current context; the only condition is that the surfaces be destroyed after the context. When the Fuchsia decoder creates a new context via vaCreateContext for the resolution change and then goes to render a picture with vaBeginPicture, vaRenderPicture and vaEndPicture, the subsequent call to vaSyncSurface returns VA_STATUS_ERROR_DECODING_ERROR. Calling vaQuerySurfaceError returns the following information ...
surface = 0x0000000c
error_status = 0x00000017
status = 2
start_mb = 0
end_mb = 0
Surface 0xc was created after the resolution change, so it should be the proper size to hold that image. I have attached the libva trace log from the Fuchsia device in case you want to verify.
libva_trace.log.170210.thd-0x00014693.log
Is it possible to reproduce with the sample decoder?
Where is the sample decoder? I can give it a try.
Get it from https://github.com/Intel-Media-SDK/MediaSDK
@StefanBossbaly you can try https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=7245. This patchset avoids re-creating the vaContext when the resolution changes. If the vaContext is re-created, media-driver will clean up all decode data, which may still be needed for decoding the next frames.
@Jexu Ok I will give it a shot and see what happens.
@feiwan1 After a couple of test streams I can no longer reproduce the original issue and it seems like that fix worked. I will do more robust testing tomorrow to verify. It seems like media-driver should not clean up the decoded data until the surfaces are destroyed, which should occur after the context(s) that are using the surface are destroyed.
From the libva docs:
Contexts and Surfaces
Context represents a "virtual" video decode, encode or video processing pipeline. Surfaces are render
targets for a given context. The data in the surfaces are not accessible to the client except if derived
image is supported and the internal data format of the surface is implementation specific.
Surfaces are provided as a hint of what surfaces will be used when the context is created through
vaCreateContext(). A surface may be used by different contexts at the same time as soon as
application can make sure the operations are synchronized between different contexts, e.g. a
surface is used as the output of a decode context and the input of a video process context.
Surfaces can only be destroyed after all contexts using these surfaces have been destroyed.
Both contexts and surfaces are identified by unique IDs and its implementation specific
internals are kept opaque to the clients
Is this something that can be fixed in media-driver or will I have to have a workaround to prevent this issue from happening in the Fuchsia decoder? Feel free to correct me if anything I said is wrong. Thanks again for the quick response!
Auto Created VSMGWL-55860 for further analysis.
@Jexu Tried it out with the sample decoder and got the correct output.
I had to convert the file from WebM to the IVF container format since that is the format that the sample decoder accepts for VP9.
ffmpeg -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -vcodec copy -an -f ivf crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm.ivf
Then I ran the sample decoder making sure to use VA-API surfaces with the hardware.
$ ./sample_decode vp9 -d -hw -p vp9d_hw -device /dev/dri/renderD128 -vaapi -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm.ivf -o ouput.yuv -i420
libva info: VA-API version 1.15.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
plugin_loader.h :185 [INFO] Plugin was loaded from GUID: { 0xa9, 0x22, 0x39, 0x4d, 0x8d, 0x87, 0x45, 0x2f, 0x87, 0x8c, 0x51, 0xf2, 0xfc, 0x9b, 0x41, 0x31 } (Intel (R) Media SDK HW plugin for VP9 DECODE)
pretending that stream is 30fps one
Decoding Sample Version 8.4.27.0
Input video VP9
Output format I420(YUV)
Input:
Resolution 1088x512
Crop X,Y,W,H 0,0,1080,512
Output:
Resolution 1080x512
Frame rate 30.00
Memory type vaapi
MediaSDK impl hw
MediaSDK version 1.35
Decoding started
Frame number: 302, fps: 269.600, fread_fps: 0.000, fwrite_fps: 283.019
Decoding finished
plugin_loader.h :211 [INFO] MFXBaseUSER_UnLoad(session=0x0x55f2341c6690), sts=0
And then verified the md5 hash of the YUV file.
md5sum ouput.yuv
Which yields the expected value of 51b3393fa98ad9ab99c0b45ef705ebc4
libva_trace.log.134106.thd-0x0000650a.log libva_trace.log.134106.thd-0x00006507.log libva_trace.log.134106.thd-0x00006508.log libva_trace.log.134106.thd-0x00006509.log
Can confirm that the sample decoder never destroys any of the surfaces or the context until the end of the stream. The sample decoder sets the current frame_width and frame_height on the VADecPictureParameterBufferVP9 structure. So this looks like a bug related to destroying the context while surfaces still exist in the middle of a stream.
These conformance streams always start with the larger resolution which means that we can always use the existing surfaces since they will be large enough to hold the smaller picture. I modified one of the streams to start with the smaller resolution and then try to switch to the larger resolution midway through the stream to see how the sample decoder would handle that case. There is a keyframe 2 seconds into the stream with the lower resolution so I cut to that point.
ffmpeg -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -ss 2 -vcodec copy -an -f ivf crowd_run_1080X512_fr30_bd8_frm_resize_l3_skip.webm.ivf
./sample_decode vp9 -d -hw -p vp9d_hw -device /dev/dri/renderD128 -vaapi -i crowd_run_1080X512_fr30_bd8_frm_resize_l3_skip.webm.ivf -o ouput.yuv -i420
libva_trace.log.140127.thd-0x00006fe3.log libva_trace.log.140127.thd-0x00006fe4.log libva_trace.log.140127.thd-0x00006fe5.log
It looks like the sample decoder destroys the surfaces when the larger resolution is encountered but does not destroy the context. So again pointing to an issue with the destruction of the context with existing surfaces.
It is API-designed behavior that when the app destroys the context, all resources bound to it are also released. For the VP9 DRC (dynamic resolution change) case, the app can re-allocate surfaces whenever the resolution changes, or, like sample_decode, re-allocate only when the resolution grows, which avoids frequent allocation. In any case, it is not required to re-allocate the whole context, unless the app can ensure no reference is needed for the next frame.
Since the ffmpeg patch solves your issue, I will close this one. Feel free to re-open it if you have any other concerns.
Which component impacted?
Decode
Is it regression? Good in old configuration?
No, this issue has existed for a long time
What happened?
The problem was observed on Fuchsia and then confirmed on Linux (via ffmpeg) when playing back the frm_resize WebM conformance streams. The stream plays back normally up until the first resolution resize. Once the resize happens, the output becomes blocks of varying colors. Once a keyframe is encountered, the stream recovers until another resize happens, and then the output again becomes blocks of varying colors. I confirmed that frame resizing works when resizing on a keyframe. Since keyframes clear out the reference frames, I suspect that it is a problem with having reference frames of different dimensions.
To reproduce with ffmpeg, run the following commands:
1) ffmpeg -hwaccel vaapi -init_hw_device vaapi=hw:/dev/dri/renderD128 -filter_hw_device hw -v verbose -c:v vp9 -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -pix_fmt yuv420p -f rawvideo -vsync passthrough -y crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
2) mpv --demuxer=rawvideo --demuxer-rawvideo-w=1080 --demuxer-rawvideo-h=512 --demuxer-rawvideo-format=I420 crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
To verify that the md5 hash does not match the WebM truth value, you need to add -autoscale 0, since ffmpeg will otherwise scale the output to whatever the starting resolution was:
1) ffmpeg -hwaccel vaapi -init_hw_device vaapi=hw:/dev/dri/renderD128 -filter_hw_device hw -v verbose -c:v vp9 -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -pix_fmt yuv420p -autoscale 0 -f rawvideo -vsync passthrough -y crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
2) md5sum crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
This yields the output 45b0fbf95bc023c849ecb9fd91367061, not the expected output of 51b3393fa98ad9ab99c0b45ef705ebc4. The outputted md5 hash also seems to change between runs.

Using the software codec libvpx-vp9 gives the proper output. To verify, run the following:
1) ffmpeg -v verbose -c:v libvpx-vp9 -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -pix_fmt yuv420p -f rawvideo -y crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
2) mpv --demuxer=rawvideo --demuxer-rawvideo-w=1080 --demuxer-rawvideo-h=512 --demuxer-rawvideo-format=I420 crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
To verify that the md5 hash does match the WebM truth value, add -autoscale 0:
1) ffmpeg -v verbose -c:v libvpx-vp9 -i crowd_run_1080X512_fr30_bd8_frm_resize_l3.webm -pix_fmt yuv420p -autoscale 0 -f rawvideo -y crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv
2) md5sum crowd_run_1080X512_fr30_bd8_frm_resize_l3.yuv

This yields the expected output of 51b3393fa98ad9ab99c0b45ef705ebc4.

The VP9 standard does allow for different decoded frames to have different sizes, with some caveats.
What's the usage scenario when you are seeing the problem?
Playback
What impacted?
No response
Debug Information
1) What's the libva/libva-utils/gmmlib/media-driver version? VA-API version 1.14.0, Intel iHD driver for Intel(R) Gen Graphics - 22.4.2
2) Output of ls /dev/dri
3) Output of vainfo >vainfo.log 2>&1: vainfo.log
4) Could you provide a libva trace log? Run export LIBVA_TRACE=/tmp/libva_trace.log first, then execute the case.

Sorry about the messy logs. ffmpeg has a -threads parameter, but it doesn't seem to limit the number of threads when using -threads 1 for hardware-accelerated playback. If you know of a way to get them all collapsed into the same log, let me know. Also, I had to append a .log extension to get GitHub to accept the upload.

libva_trace.log.164146.thd-0x00088bcf.log libva_trace.log.164146.thd-0x00088bd0.log libva_trace.log.164146.thd-0x00088bd1.log libva_trace.log.164146.thd-0x00088bd2.log libva_trace.log.164146.thd-0x00088bd3.log libva_trace.log.164146.thd-0x00088bd4.log libva_trace.log.164146.thd-0x00088bd5.log libva_trace.log.164146.thd-0x00088bd6.log libva_trace.log.164146.thd-0x00088bd7.log libva_trace.log.164146.thd-0x00088bd8.log
Do you want to contribute a patch to fix the issue?
No response