ffmpeginteropx / FFmpegInteropX

FFmpeg decoding library for Windows 10 UWP and WinUI 3 Apps
Apache License 2.0

xbox + 4k = problems² #414

Closed softworkz closed 6 months ago

softworkz commented 6 months ago

Hi,

A few days ago I started testing on the Xbox. The results so far are somewhat disappointing, unfortunately. Full HD playback is working in most cases, but when it comes to 4k playback, the results look more like what you'd get from a Raspberry Pi than from a contemporary gaming console. I don't mean to blame anybody or anything; I really don't know what it is, and it's very possible that the error is on my side.

The symptoms are primarily stuttering/pausing of playback, which means that processing is simply too slow to play back continuously. In the worst cases, it can be like 10s of pausing for 1s of playback. It appears that the more data is moved around, the worse it gets. There's a dependency on the bandwidth of the source material, but also on later stages in the playback chain. For example, a video which plays almost fluently in SDR starts hanging/stuttering once the device is set to HDR mode.

The bottleneck appears to be system (CPU) memory. There's a limit of 1 GB RAM for UWP apps and this always gets hit instantly when starting playback (there's a capability to extend this to 1.5 GB, but it doesn't help much).

There's also not much difference when switching between SystemDecoder and FFMpegSoftwareDecoder. In neither case does it appear to use a purely GPU-based video pipeline - otherwise memory usage shouldn't increase so steeply.

Any ideas what it can be and what I should try? Is there some log output (like from ffmpeg) that I can acquire? Are there any options that I should set in a certain way? And finally, what does "System Decoder" mean, actually?

Thanks

brabebhin commented 6 months ago

Hi,

We never tested this on an xbox. None of us has an xbox for testing.

System decoder should be the way to go forward in xbox. It means we only do demux in our library, and the system handles the decoding. We passthrough encoded samples to the OS to use its internal codecs to do the decoding, this should be the same as using MediaSource.FromStream or Uri.

So system decoder is different than the D3D or software options in which we do both demux and decoding.

I think D3D decoder does not work on xbox. No idea why. It might be some API incompatibility.

softworkz commented 6 months ago

> We passthrough encoded samples to the OS to use its internal codecs to do the decoding, this should be the same as using MediaSource.FromStream or Uri.

Where (by whom) is all the memory allocated? Maybe you are feeding too fast/much?

> So system decoder is different than the D3D or software options in which we do both demux and decoding.
>
> I think D3D decoder does not work on xbox. No idea why. It might be some API incompatibility.

Maybe it's just different decoder GUIDs. Xbox uses non-standard GPU hardware, so it might very well use some special decoder GUIDs.

softworkz commented 6 months ago

> Maybe you are feeding too fast/much?

Is it a pull or a push model?

brabebhin commented 6 months ago

The model is the same as everywhere: the system pulls from us, we pull from the stream. The only implementation using a push model is subtitles. Memory is allocated by us, but remember these are compressed samples in the system decoder scenario, so it shouldn't go overboard.

This kind of memory build up usually occurs when the timestamps of the samples are "wrong", they do not come at the current playback position but at an earlier or later position. So the system keeps buffering samples instead of playing and then releasing them.
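As a rough illustration of that failure mode, here is a minimal Python sketch that scans a list of sample timestamps (in seconds) and flags the ones that land far away from the expected playback position. The function name, the `tolerance` parameter and the data are hypothetical and only meant for offline diagnosis; nothing here is part of FFmpegInteropX.

```python
def find_timestamp_outliers(timestamps, start=0.0, tolerance=1.0):
    """Return (index, timestamp) pairs for samples far from the expected position.

    `timestamps` are sample times in seconds in delivery order.
    `tolerance` is a hypothetical threshold, not a value used by any real pipeline.
    """
    outliers = []
    expected = start
    for index, ts in enumerate(timestamps):
        if abs(ts - expected) > tolerance:
            # A sample far before or after the playback position would be
            # buffered instead of played and released.
            outliers.append((index, ts))
        else:
            # The sample is near the expected position, so playback advances.
            expected = ts
    return outliers


suspicious = find_timestamp_outliers([0.0, 0.04, 0.08, 30.0, 0.12])
```

A non-empty result would hint at timestamps that make the system buffer samples rather than play them.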

softworkz commented 6 months ago

> This kind of memory build up usually occurs when the timestamps of the samples are "wrong", they do not come at the current playback position but at an earlier or later position. So the system keeps buffering samples instead of playing them.

It happens with every 4k video, even an H.264 one. Only ones with very low bandwidth are playing (more or less).

softworkz commented 6 months ago

> but remember these are compressed samples in the system decoder scenario, so it shouldn't go overboard.

Yeah, such a blowup rather looks like software decoding. How can I see what's actually happening with ffmpeg? Is there a log output?

brabebhin commented 6 months ago

If you can attach a debugger you can see some logs in the output window; there's also the ILogProvider interface you can use. But the scenario of us accidentally doing a software decode is unlikely. When we use the System decoder option, we create VideoEncodingProperties that fit the target format (h264, h265, vp9, etc.; see https://learn.microsoft.com/en-us/uwp/api/windows.media.mediaproperties.videoencodingproperties?view=winrt-22621), and when we use the software decoder we use nv12 format.

So if we were to feed nv12 (output of software decoder) into those h264/265/whatever formats, you'd usually get a green screen and not see any video at all.

Technically the output of the software decoder can be something other than nv12, like bgra or Iyuv; those are configurable, but nv12 is the commonly supported format.

I think it is more likely the OS does the software decoding for whatever reason. I know you need some extensions installed for h264 and h265. We kind of bypass that on desktop with directx.

softworkz commented 6 months ago

I think I figured out why I always got software decoding:

I was setting the VideoDecoderMode after FFmpegMediaSource.CreateFromUriAsync. I had set it once at an earlier point, but then I got no video, just audio, so I moved it back to afterwards.

Probably it needs to be set before. But that just replaces one problem with another one...

softworkz commented 6 months ago

The HEVC files are playing fine with regular playback (like MediaSource.CreateFromUri(uri))

From the comparison, I could see that playback startup in the failing case is much faster, which likely indicates that there's not even an attempt being made to set up and initialize video decoding.

softworkz commented 6 months ago

When I set VideoDecoderMode to Automatic, I see this in the debug output:

Exception thrown at 0x00007FFAC7223EEC (KernelBase.dll) in My.Client.Uwp.exe: WinRT originate error - 0x80070057 : 'Adjusted video area is smaller than supported by format'.
Exception thrown at 0x00007FFAC7223EEC (KernelBase.dll) in My.Client.Uwp.exe: WinRT originate error - 0x80070057 : 'Adjusted video area is smaller than supported by format'.

brabebhin commented 6 months ago

I think the automatic will pick up directx decoder because the hardware technically supports the codecs, but ffmpeg then fails to decode properly.

softworkz commented 6 months ago

I made a little progress and got a few new findings. As usual, it's getting more complicated instead of easier.

H.264

@brabebhin, you were right - at least in regards to one of the files, which is a 4k H.264 file. The hw decoder on the xbox doesn't support H.264 at resolutions > FullHD. This applies to both cases, ForceSystemDecoder and Automatic. In both cases it falls back to software decoding. In the latter case, ffmpeg outputs:

>>>> Debug [h264 @ 0000012AB8EF3880] Format d3d11 not usable, retrying get_format() without it.
>>>> Debug [h264 @ 0000012AB8EF3880] Format yuv420p chosen by get_format().
>>>> Info [h264 @ 0000012AB8EF3880] Reinit context to 3840x2160, pix_fmt: yuv420p

I don't think it's a technical limitation since the GPU can decode HEVC 4k, which is much more compute intensive. Probably it's about a cost they wanted to avoid (paying AMD for GPU features). That makes sense when considering another detail: HEVC playback is locked behind a restricted capability. Without that, you cannot even switch the display to HDR mode. I'm confident, though, that we will get this granted. It's just a measure to avoid paying license fees for all xboxes. By tracking which apps are installed on how many boxes, they'll be able to pay only for those devices where an app is installed that needs it.

Anyway, H.264 4k is not a big deal. We'll mark this as unsupported and then the server will transcode in those cases.
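A minimal Python sketch of what such a "mark as unsupported" check could look like on the client side. The limits table is an assumption based only on what was observed in this thread (H.264 hardware decoding up to FullHD, HEVC working at 4k); the names `MAX_HW_RESOLUTION`, `VideoStream` and `needs_transcode` are hypothetical and not part of any real API.

```python
from dataclasses import dataclass

# Assumed limits, taken from the observations in this thread only.
MAX_HW_RESOLUTION = {
    "h264": (1920, 1080),  # hw decoder fell back to software above FullHD
    "hevc": (3840, 2160),  # 4k HEVC was decoded by the GPU
}


@dataclass
class VideoStream:
    codec: str
    width: int
    height: int


def needs_transcode(stream: VideoStream) -> bool:
    """Return True if the server should transcode instead of direct play."""
    limit = MAX_HW_RESOLUTION.get(stream.codec)
    if limit is None:
        # Unknown codec: be conservative and ask the server to transcode.
        return True
    max_width, max_height = limit
    return stream.width > max_width or stream.height > max_height
```

For example, `needs_transcode(VideoStream("h264", 3840, 2160))` is true under these assumed limits, while the same resolution in HEVC would be played directly.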

softworkz commented 6 months ago

HEVC

For HEVC it's getting interesting. Now that I have the logging attached, I can better see and understand what's actually happening. Altogether there are 4 ways to play back HEVC videos:

1. FFmpegInteropX with ForceFFmpegSoftwareDecoder

This is basically working but it's way too slow; it hangs and stutters and is not useful in practice.

2. Windows.Media MediaSource

Without FFmpegInteropX, all cases are working well (4k yes/no, HDR yes/no).

3. FFmpegInteropX with ForceSystemDecoder

This hasn't been working so far - but more precisely: It does play in fact and no error is logged. Audio is playing and FFmpeg continuously downloads the video data. I even see some log messages from DX11VA dlls. "Only" problem: The screen remains dark - no picture. But also no error.

Then I did a comparison of the VideoEncodingProperties from case 2 and this one (in the MediaOpened event). I noticed that the SubType was set to "HEVC" in case 2 and "HEVCES" in this case.

Elementary Streams ("ES") are a concept of MPEGTS so I wonder why it's set to HEVCES even though I never played an MPEGTS?

But then I did, and guess what? In those cases it's working fine and the videos are visible.

=> My suspicion would be that there's something about the data it gets which causes it to assume that it gets an elementary stream. Maybe it's then just parsing and waiting for data it can recognize (which never happens).

4. FFmpegInteropX with Auto (FFmpeg HW Decoding)

There's no general issue like that FFmpeg couldn't do D3D11VA decoding on the Xbox. It is working successfully with an H.264 FullHD video.

I hadn't tested this "Auto" setting much before, because "Auto" might often be convenient, but for testing something, it's rarely a good choice, because "Auto" often implies that you won't know what's happening. The few times I tested it, I must have had a mismatch between the active screen's HDR mode and the HdrSupport setting (Enable or Disable), because that causes the app to freeze forever and I had experienced that as well.

If the HDR settings match, then it's playing - basically. But not as good as in case 2. Here, it depends on the framerate. 24fps videos (HEVC, 4k, HDR) are playing ok, but with 60fps videos, I see hanging and stuttering. Maybe this can be improved by tweaking ffmpeg parameters, I'll try that.

I couldn't compare with 3 because I don't have an mpegts video with 60fps.

My priority would be trying to fix 3 rather than fiddling around with tweaking 4. Even though the exact same HW decoders are being used in both cases, there are differences in the way the decoding is orchestrated (surface pool counts, surface allocation, releasing, recycling, forwarding for presentation, supplying data via staging surfaces, etc.). FFmpeg DXVA decoding is not optimized for use in a player, while MS' implementation is surely very well optimized for playback, most likely with the help of, or done by, the manufacturer (AMD) in the case of a product like xbox.

@brabebhin - you said already that ForceSystemDecoder is the way to go on xbox and I tend to agree about that.

Does any of you have an idea what it could be in case of 3?

brabebhin commented 6 months ago

Wow @softworkz thanks for the detailed breakdown.

I have checked the code, and we don't explicitly set HEVCES subtype anywhere, so this must somehow come from ffmpeg or something. The next step I can think of here is trying to reproduce this in one of the samples; maybe with a debugger attached we can better understand what's going on.

Would it be possible to share that HEVCES video?

softworkz commented 6 months ago

Just for reference, here are the available decoders on my Xbox (Xbox S) alongside the color formats being used:

{1b81bea3-a0c7-11d3-b984-00c04f2e73c5}: DXGI_FORMAT_NV12  DXVA_ModeVC1_D/DXVA_ModeVC1_VLD - VC-1 variable-length decoder
{1b81be68-a0c7-11d3-b984-00c04f2e73c5}: DXGI_FORMAT_NV12  DXVA_ModeH264_E/DXVA_ModeH264_VLD_NoFGT - H.264 variable-length decoder, no film grain technology
{ee27417f-5e28-4e65-beea-1d26b508adc9}: DXGI_FORMAT_NV12  D3D11_DECODER_PROFILE_MPEG2_VLD - MPEG-2 variable-length decoder
{efd64d74-c9e8-41d7-a5e9-e9b0e39fa319}: DXGI_FORMAT_NV12  DXVA_ModeMPEG4pt2_VLD_Simple - MPEG-4 Part 2 variable-length decoder, Simple profile
{705b9d82-76cf-49d6-b7e6-ac8872db013c}: DXGI_FORMAT_NV12  DXVA_ModeH264_VLD_Multiview_NoFGT - H.264 MVC variable-length decoder, multiview
{5b11d51b-2f4c-4452-bcc3-09f2a1160cc0}: DXGI_FORMAT_NV12  DXVA_ModeHEVC_VLD_Main - H.265 variable-length decoder, Main profile
{107af0e0-ef1a-4d19-aba8-67a163073d13}: DXGI_FORMAT_P010  DXVA_ModeHEVC_VLD_Main10 - H.265 variable-length decoder, Main 10 profile
{463707f8-a1d0-4585-876d-83aa6d60b89e}: DXGI_FORMAT_NV12  DXVA_ModeVP9_VLD_Profile0
{a4c749ef-6ecf-48aa-8448-50a7a1165ff7}: DXGI_FORMAT_P010  DXVA_ModeVP9_VLD_10bit_Profile2
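For quick cross-checking against what FFmpeg or a DXVA tool reports, the list above can be put into a small Python lookup table. The GUIDs, formats and profile names are copied from the list; the helper names are made up for this sketch.

```python
# GUID -> (surface format, profile name), copied from the list above.
XBOX_DECODER_PROFILES = {
    "1b81bea3-a0c7-11d3-b984-00c04f2e73c5": ("DXGI_FORMAT_NV12", "DXVA_ModeVC1_D/DXVA_ModeVC1_VLD"),
    "1b81be68-a0c7-11d3-b984-00c04f2e73c5": ("DXGI_FORMAT_NV12", "DXVA_ModeH264_E/DXVA_ModeH264_VLD_NoFGT"),
    "ee27417f-5e28-4e65-beea-1d26b508adc9": ("DXGI_FORMAT_NV12", "D3D11_DECODER_PROFILE_MPEG2_VLD"),
    "efd64d74-c9e8-41d7-a5e9-e9b0e39fa319": ("DXGI_FORMAT_NV12", "DXVA_ModeMPEG4pt2_VLD_Simple"),
    "705b9d82-76cf-49d6-b7e6-ac8872db013c": ("DXGI_FORMAT_NV12", "DXVA_ModeH264_VLD_Multiview_NoFGT"),
    "5b11d51b-2f4c-4452-bcc3-09f2a1160cc0": ("DXGI_FORMAT_NV12", "DXVA_ModeHEVC_VLD_Main"),
    "107af0e0-ef1a-4d19-aba8-67a163073d13": ("DXGI_FORMAT_P010", "DXVA_ModeHEVC_VLD_Main10"),
    "463707f8-a1d0-4585-876d-83aa6d60b89e": ("DXGI_FORMAT_NV12", "DXVA_ModeVP9_VLD_Profile0"),
    "a4c749ef-6ecf-48aa-8448-50a7a1165ff7": ("DXGI_FORMAT_P010", "DXVA_ModeVP9_VLD_10bit_Profile2"),
}


def lookup_decoder(guid: str):
    """Return (format, profile) for a GUID, or None if it is not in the list."""
    key = guid.strip().strip("{}").lower()
    return XBOX_DECODER_PROFILES.get(key)


def profiles_with_format(fmt: str):
    """Return the sorted profile names that output the given surface format."""
    return sorted(name for f, name in XBOX_DECODER_PROFILES.values() if f == fmt)
```

The lookup accepts GUIDs with or without braces and in any letter case, so values pasted from logs can be used directly.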

softworkz commented 6 months ago

> I have checked the code, and we don't explicitly set HEVCES subtype anywhere, so this must somehow come from ffmpeg or something.

Yes, I checked that as well. This seems to be set by the Windows.Media framework. FFmpegInteropX also isn't setting any properties (https://learn.microsoft.com/en-us/uwp/api/windows.media.mediaproperties.videoencodingproperties.properties), which are populated like this:

image

> Would it be possible to share that HEVCES video?

This happens for all videos I tried, which were mp4 and mkv. (also for .ts - but for those it's correct)

I'll try on desktop Windows now, to see how that compares.

lukasf commented 6 months ago

Please try with branch from fix-hevc-passthough. It should fix playback on XBOX.

The HEVCES subtype seems to be created when we use VideoEncodingProperties.CreateHevc(). But it does not cause problems once the NAL packet parsing bug is fixed. Works all fine now for me.
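Some background on why NAL packet parsing matters here, as a hedged sketch rather than the actual fix from the branch: MP4 and MKV usually store HEVC as length-prefixed NAL units, while an elementary stream (as carried in MPEG-TS) uses Annex B start codes. Assuming the system decoder expects start codes for the ES subtype, a passthrough has to rewrite each sample roughly like this. The Python function below is hypothetical, and the 4-byte length size is an assumption (the real size comes from the stream's codec configuration).

```python
START_CODE = b"\x00\x00\x00\x01"


def length_prefixed_to_annex_b(sample: bytes, length_size: int = 4) -> bytes:
    """Rewrite length-prefixed NAL units as Annex B start-code-delimited units.

    `length_size` is the size in bytes of each big-endian NAL length field;
    4 is only an assumed default for this sketch.
    """
    out = bytearray()
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NAL length field")
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("NAL unit exceeds sample size")
        # Replace the length prefix with a start code and copy the payload.
        out += START_CODE
        out += sample[pos:pos + nal_len]
        pos += nal_len
    return bytes(out)
```

If a sample is split at the wrong offsets, the decoder never sees a NAL unit it recognizes, which would match the "plays, but screen stays dark" symptom described earlier in the thread.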

softworkz commented 6 months ago

> Please try with branch from fix-hevc-passthough. It should fix playback on XBOX.
>
> The HEVCES subtype seems to be created when we use VideoEncodingProperties.CreateHevc(). But it does not cause problems once the NAL packet parsing bug is fixed. Works all fine now for me.

Excellent - that was it. Thank you very much!

lukasf commented 6 months ago

Closing this after merge of #416 and #417