ffmpeginteropx / FFmpegInteropX

FFmpeg decoding library for Windows 10 UWP and WinUI 3 Apps
Apache License 2.0
212 stars 53 forks source link

Stream buffer implementation #270

Closed lukasf closed 2 years ago

lukasf commented 2 years ago

I have been working on this for a while, and finally I think I have a good solution.

This PR solves a lot of problems we had with streaming, where we were previously using MSS buffer (#230, #168). The MSS buffer works on decoded samples, causing heavy memory use (plus heavy GPU memory use when using new HW acceleration). Plus it was delaying playback start and seeking due to MSS filling the complete buffer before playback starts.

With this PR, the lib will now read-ahead packets up to a configurable byte size and duration limit. Because only compressed samples are buffered, memory consumption is very low and we do not have any issues with HW acceleration. When seeking forward, the lib will even check if the target position is buffered. If the position is buffered, we just skip ahead in the buffer, no file seeking needed at all! Skipping forward in a stream is impressively fast when the area is buffered, especially when fast seek is enabled.

I also needed to work on subtitles. Because they are now decoded ahead of time, I needed to delay creation of SoftwareBitmaps for image subtitles until a sub is actually shown (and discard the bitmap again when cue is out). This seems to work very well and should even improve memory consumption and performance when using external image subtitle files. One more benefit of the new solution is that subtitles are now immediately available when selecting them.

I also reworked the fast seek code. It is now even more reliable. I have been thinking if we should enable fast seek by default? It really makes seeking a lot faster for normal files, and for streams it is incredibly helpful. Currently it is disabled by default.

This will make some merge effort with the CppWinRT migration, sorry for that.

brabebhin commented 2 years ago

Cool. I will give it a try the coming days.

Regarding the cpp/winRT migration, i know this will mean some extra work but it shouldn't be that hard. Most changes are contained in ffmpeg reader and the new class, which is essentially a copy paste.

brabebhin commented 2 years ago

So I gave it a try on some of my HEVC files, and the first impression is that it indeed is much faster when seeking (fast seek and slow seek are just as fast apparently), however, CPU usage has increased significantly compared to the standard-cpp-migration branch. The same file puts my CPU in 80-90% load with this branch while the standard-cpu-branch is only 40-50.

lukasf commented 2 years ago

Yeah the last changeset was buggy, it never stopped buffering ^^ Will push an update later.

lukasf commented 2 years ago

So I pushed a new update. Should fix the infinite buffering, which probably caused the high CPU usage you saw (and it ate loads of memory). I also ran a CPU profiler, almost zero time spent in buffering code.

But during that profile run, I found out that our D3D TexturePool was not working! 5% of total CPU time was spent allocating new textures. We put the samples in a map to prevent them from being collected (Processed event does not come if the sample is not referenced). I guess initially we used a list of MediaStreamSample^, but later we changed it to IUnknown for whatever reason. IUnknown is of cource not ref counted, so adding it as IUnknown to the map did not prevent collection. So the Processed events were not raised anymore.

In my new solution I just call AddRef on IUnknown and release it in Processed. No map needed anymore.

lukasf commented 2 years ago

Without FastSeek, I regularly see a small delay after click, like up to half a second or so until playback starts. It depends on the position, how far the next key frame is (and on the video bitrate / decode speed). Often it is also nearly instant, but not always. With FastSeek enabled, it is always instant, zero delay. I have enabled it by default now in the config.

brabebhin commented 2 years ago

I also noticed something weird with the TexturePool when porting, and I added the ref in DirectXInteropHelper, and manually released the textures in the destructor of the texture pool. It seems to be fairly stable like that. I just assumed it was something related to C++/winRT

brabebhin commented 2 years ago

When enabling the GPU video effect. I get the following error upon opening a file:

WinRT originate error - 0x80004005 : 'The window has already been destroyed.'.

However, this occurs with both the master and this branch, so I suppose this is just another bug in winRT/UWP. I am starting to wonder if we should even bother continuing to support that video effect, and only keep it in as a sample somewhere in a readme file.

brabebhin commented 2 years ago

This might be counterintuitive but one of the advantages of cpp/winRT is easier async/multithreading support. Maybe we should switch this around and do it directly in cpp/winRT?

lukasf commented 2 years ago

I would rather like to get the stream buffer out soon. The cpp branch will probably still take a while. But we might be able to improve some of the synchronization there.

lukasf commented 2 years ago

I think this is also more or less finished. The subtitle bug that i observed on master branch is also fixed here. It was a problem with RefCue (broken during WinRT conversion).

I will probably set the default to disable stream buffering until I have more long term results.

brabebhin commented 2 years ago

Do you have any specific issue in mind that would like checked?

lukasf commented 2 years ago

It's more the general stability, especially when seeking around heavily. I have not seen crashes in a while but who knows... Multi threading these kind of things is not trivial, and we don't have all the nice stuff from .net available, unfortunately.

lukasf commented 2 years ago

Merged this one to master now. Read-ahead buffer is disabled by default, so there is not much risk in bringing this in.