CrendKing / avisynth_filter

DirectShow filters that put AviSynth and VapourSynth into video playing
MIT License

d3d11 direct-mode breaks when remote control is enabled #72

Closed sofakng closed 2 years ago

sofakng commented 2 years ago

This is regarding issue #63.

@CrendKing It looks like when I turn on [Enable Remote Control] in AviSynthFilter, it stops using direct mode. It doesn't matter whether any scripts are running; simply enabling the remote control breaks it.

Here are my logs: avisynth_filter_no-remote.log (d3d11 cb direct) and avisynth_filter-remote.log (d3d11 cb).

Strangely, the log file with remote control enabled is much larger than the one without it.

This is a big deal because with direct mode enabled, my 6K-8K videos play back fantastically. Once direct mode is gone, they are extremely choppy. Sometimes they become smooth for a few seconds but then go back to being choppy.

CrendKing commented 2 years ago

If you don't have any script and remote control is also disabled, AVSF automatically unloads itself, so LAV connects directly to the renderer. That's why the log is small, and why LAV is able to use "d3d11 native" mode.

When remote control is enabled, AVSF won't unload itself. LAV connects to AVSF first, which in turn connects to the renderer. Everything in AviSynth/VapourSynth runs on the CPU, so LAV has to copy frames from the GPU to main memory, hence the "d3d11 cb direct" mode.

If you want to use this filter, you have to accept that memory copy penalty. If your hardware can't handle 8K without the copy-free mode, you might want to consider disabling AVSF, or writing a script that conditionally unloads the filter based on video resolution.
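To get a feel for the size of that copy penalty, here is a rough back-of-the-envelope sketch. The bytes-per-pixel figures (NV12 at 1.5, P010 at 3) and the 60 fps frame rate are assumptions chosen for illustration, not values taken from the logs in this thread, and the result covers only one copy of each frame, not everything the pipeline does.

```python
# Rough estimate of the memory traffic for one GPU-to-RAM copy per frame.
# Assumed (not from the thread): NV12 = 1.5 bytes/pixel, P010 = 3 bytes/pixel.
BYTES_PER_PIXEL = {"NV12": 1.5, "P010": 3.0}


def copyback_bytes_per_second(width, height, fps, pixel_format="NV12"):
    """Return the bytes per second needed to copy every decoded frame once."""
    frame_bytes = width * height * BYTES_PER_PIXEL[pixel_format]
    return frame_bytes * fps


if __name__ == "__main__":
    # Hypothetical resolutions and frame rate, for comparison only.
    for name, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160), "8K": (7680, 4320)}.items():
        rate = copyback_bytes_per_second(w, h, fps=60)
        print(f"{name}: {rate / 1e9:.2f} GB/s")
```

Under these assumptions an 8K stream needs roughly 16 times the copy bandwidth of a 1080p stream, which is why a setup that handles smaller videos comfortably can still struggle here.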

sofakng commented 2 years ago

Thanks for the explanation; that clarifies what is happening.

However, when AVSF is connected, shouldn't it still be using d3d11 cb direct instead of just d3d11 cb? (That was the issue from #63.)

Also, do you have any idea what memory speed/requirement I would need for larger video files like these?

CrendKing commented 2 years ago

It is d3d11 cb direct on my end:

[screenshot: Clipboard 1]

> Also, do you have any idea what memory speed/requirement I would need for larger video files like these?

I have no idea. Everyone's hardware, software and scripts are different.

sofakng commented 2 years ago

You are seeing d3d11 cb direct with Remote Control enabled?

Also, can I benchmark the memory throughput somehow, maybe using vspipe or something similar? (To test the difference with RAM overclocking, faster speeds, etc.)

CrendKing commented 2 years ago

> You are seeing d3d11 cb direct with Remote Control enabled?

Yes. It doesn't matter, because RC does not affect the format negotiated during pin connection. The earliest point at which an external app such as SVP can inject a script is when the input pin becomes active.

> Benchmark

I'm not sure what you are trying to achieve. Do you have a particular script that you think is slowing down the process? If so, replace it with a no-op script and try again. If playback then becomes OK, you know the script needs to change. If it is still bad, the bottleneck is AVSF/VPSF, and we can talk about the performance issue in the filter. We can profile the process to see where most of the resources are spent.

sofakng commented 2 years ago

Here is an example video: https://user-images.githubusercontent.com/2228499/165659290-77df266b-af08-4e56-9f56-2c2630741183.mp4

Please ignore the resolution and aspect ratio (I've recreated this to match a virtual reality video causing the same problem).

When I enable remote control, with no script assigned, LAV reports d3d11 cb. However, if I disable remote control, LAV reports d3d11 cb direct.

I'm not using any script but only enabling/disabling the remote control checkbox.

As for the benchmark, I'm trying to troubleshoot by not using any script at all. Simply enabling remote control (with no script assigned) causes my 6K (and 8K) videos to stutter, so it's not caused by a particular script.

It's not the d3d11 (or dxva2) copyback that seems to be causing the stutter. The stutter only occurs when I enable AVSF (with remote control enabled but no script, so it should be attached but not actively processing anything?)

CrendKing commented 2 years ago

Thanks for the video. It explains why you are only seeing "cb" while I'm seeing "cb direct".

Since issue #63 I have known that LAV enables the "direct" mode if the media sample's stride is a multiple of 16 or 32, depending on the format. If the video's width happens to be a multiple of that number, we get the direct mode without doing anything. For example, my test video is 1280x720, where 1280 is a multiple of 16. Your video is 5800x2900, where 5800 is NOT a multiple of 16.
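The arithmetic behind those two examples can be checked with a small sketch. The helper name `is_aligned` is hypothetical, and the stride is treated as equal to the width in pixels, as described in this thread; this is not LAV's actual check.

```python
def is_aligned(value, alignment):
    """Return True if value is an exact multiple of alignment."""
    return value % alignment == 0


if __name__ == "__main__":
    # The two widths mentioned in the thread.
    for width in (1280, 5800):
        print(width, "multiple of 16:", is_aligned(width, 16),
              "| multiple of 32:", is_aligned(width, 32))
```

1280 divides evenly by both 16 and 32, while 5800 leaves a remainder of 8 in both cases, so it fails either requirement.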

Previously AVSF had no negotiation logic to align the stride, so the stride was always the same as the width of the video. https://github.com/CrendKing/avisynth_filter/actions/runs/2238833269 is a test build that always aligns the stride to 32. So you should always get "cb direct" mode now.

The second problem you mentioned, where performance is poor if you enable RC without any script, is caused by a combination of how AviSynth works and how AVSF's RC works. Basically, in AVSF, RC enabled + no script is equivalent to using the following script:
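Aligning a stride usually means rounding it up to the next multiple of the alignment. The following is a minimal sketch of that rounding, not the code from the test build; `align_up` is a hypothetical helper name, and the stride is again counted in pixels for simplicity.

```python
def align_up(value, alignment):
    """Round value up to the nearest multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


if __name__ == "__main__":
    width = 5800                      # width of the sample video in this thread
    stride = align_up(width, 32)      # padded row length
    print("stride:", stride, "padding per row:", stride - width)
```

A width that is already a multiple of 32, such as 1280, comes back unchanged, so rounding like this only adds padding where it is needed.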

AvsFilterSource()

As you may notice, there is no Prefetch() here. Due to the single-threaded nature of AviSynth, performance on large videos suffers significantly without the prefetcher. In comparison, VapourSynth's API is mainly async, so if you do the same thing with VPSF you won't see any performance issue.

You may ask why I do not add Prefetch() in this case. The reason is that RC is mainly used by external apps to inject their script at an early stage, and the injected script should have its own Prefetch() statement.

If you have a valid use case where injection happens conditionally, or late into playback, I suggest you provide a simple avs script like

AvsFilterSource()
Prefetch(8)

or switch to VPSF.
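As a loose analogy for why prefetching helps, the sketch below compares producing frames one at a time with producing them on a pool of worker threads. It is not how AviSynth's Prefetch() is implemented; `render_frame`, the 10 ms per-frame cost, and the frame count are all hypothetical stand-ins.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def render_frame(n, cost=0.01):
    """Stand-in for per-frame filter work; sleeps instead of processing pixels."""
    time.sleep(cost)
    return n


def play_serial(frame_count):
    """Produce frames one after another on a single thread."""
    return [render_frame(n) for n in range(frame_count)]


def play_prefetched(frame_count, threads=8):
    """Produce frames on a thread pool; map() yields results in frame order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(render_frame, range(frame_count)))


if __name__ == "__main__":
    for play in (play_serial, play_prefetched):
        start = time.perf_counter()
        play(48)
        print(play.__name__, f"{time.perf_counter() - start:.2f} s")
```

Both functions return the same frames in the same order; the pooled version simply overlaps the per-frame work, which is the general effect a higher Prefetch() count aims for.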

sofakng commented 2 years ago

Thanks again so much for the help.

I understand what you are saying about the media sample's stride, and that 5800 is not a multiple of 16 or 32, so that explains direct mode not working.

However, what is the purpose of your test build that you linked? You mentioned it will force the stride to 32 and I've tried it with my test video but it crashes (which I guess is expected because the resolution isn't a multiple of 32 as we discussed?)

Also, thanks for the information about AVSF. Just for testing, I've created a script with the exact two lines you listed above and I'm seeing the same stutter. Does that indicate a performance problem elsewhere on my system (i.e., RAM speed/bandwidth)?

Thanks SO MUCH for working with me through this problem!

CrendKing commented 2 years ago

> it crashes

I can't reproduce a crash. Please upload your log.

> which I guess is expected

No. I updated the filter so that you don't need to do anything and always get the direct mode, regardless of what video you watch.

Crash is never expected :-)

> a performance problem elsewhere on my system

How much CPU usage are you seeing? If it is already close to 100%, then your system can't handle this. However, I have a Ryzen 5800X, and that two-line script only costs me about 15%. So it shouldn't be that bad, unless your CPU is very old.

Also, there is a chance you are not applying the script correctly. The log would tell me more.

sofakng commented 2 years ago

OK - Here is my log file from the debug version crashing: avisynth_filter.log

I'm using MPC-HC v1.9.21.2: [screenshot]

When I start playing a video with remote control enabled (but no script specified), MPC-HC crashes and it uploads this dump file: https://drdump.com/UploadedReport.aspx?DumpID=95836898


Here is an additional log file using the non-debug version of AVSF and the example script above (i.e., AvsFilterSource() and Prefetch(8)): avisynth_filter.log

It stutters in the beginning and then plays back smoothly, but if I seek anywhere in the video, it stutters for about 5 seconds every time and then is fine. MPC-HC doesn't report any dropped frames, so it's something else.

My CPU is at 25% and I have 32 GB of RAM. The file is located on a local NVME SSD so I don't think that's the issue either but let me know if you see anything.

CrendKing commented 2 years ago

Gotcha. It is due to your AviSynth+ version being 3.5. I recently switched to a function that only exists in 3.7, because the old one is deprecated. Can you try 3.7.2?


Changing Prefetch(8) to Prefetch(16) or even Prefetch(24) might help.

sofakng commented 2 years ago

I think you've solved it. Upgrading to AviSynth+ v3.7.2 fixed the crash and also enabled direct mode. The stutters are now basically gone except for about a 0.5 second stutter when seeking.

One last question before this is closed --

You mentioned that LAV Filters automatically enables direct mode if the resolution is a multiple of 16 or 32. Does this change mean AVSF now always uses a stride aligned to 32, so direct mode will work with all videos in the future, regardless of whether the resolution is a multiple of 16 or 32?

CrendKing commented 2 years ago

A 0.5-second stutter is not ideal, but also not very bad on such a large video. You could also try the VapourSynth variant, which could be better. Still, if you don't have anything for the filter to do (e.g. you don't use SVP, mvtools, etc.), you don't need to use this filter at all.

To your question: yes. The old code required the video width to be a multiple of 16 or 32. The new code does not; it adapts automatically.

Reopen if needed.