Nevcairiel / LAVFilters

LAV Filters - Open-Source DirectShow Media Splitter and Decoders
GNU General Public License v2.0

d3d11: Add fast clearing path if 11.1 is available #463

Closed kasper93 closed 2 years ago

Nevcairiel commented 2 years ago

Unfortunately this doesn't seem to work for P010/P016 for me; the frame is still green. I only did a quick test due to lack of time and haven't tried to dig deeper yet.

clsid2 commented 2 years ago

ClearView() is an optional feature? https://docs.microsoft.com/en-us/windows/win32/api/d3d11/ns-d3d11-d3d11_feature_data_d3d11_options

kasper93 commented 2 years ago

Ok guys, let me fix that. I just quickly put it together after work...

> ClearView() is an optional feature? https://docs.microsoft.com/en-us/windows/win32/api/d3d11/ns-d3d11-d3d11_feature_data_d3d11_options

I don't really understand the docs here.

So for feature levels 9_1, 9_2, and 9_3 it is emulated by the runtime and always available, but it can become unavailable at higher levels because they don't emulate it anymore? And it is a nice touch that there is no mention of any of this on the ClearView() doc page. So is it always emulated if not available? Or is it just a dummy call in that case? Well, that would be stupid; there is no error status on it.

clsid2 commented 2 years ago

I assume it will probably just be a no-op.

Googling, I found a code example: https://chromium.googlesource.com/angle/angle/+/chromium/2272/src/libANGLE/renderer/d3d/d3d11/Clear11.cpp#135

Nevcairiel commented 2 years ago

Hardware calls typically have no error returns; you can possibly see something with a debug context, though.

In any case, ClearView works with NV12 for me, so I would suspect that the support bit is set. Unless that's a fluke and it's not actually set?

Edit: I checked and the compat flag is indeed set here. Either it's just broken for P010 or something else is afoot.

kasper93 commented 2 years ago

> Hardware calls typically have no error returns, you can possibly see something with debug context though.

I know that for errors. I guess it's the same for unsupported calls: it is the caller's job to check caps...

https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#ClearView

> 5.2.3.3 Alternative: ClearView [...] This feature is required to be supported for all D3D10+ hardware in D3D11.1 drivers and for D3D9 drivers maps to the already existing functionality there. The D3D9 equivalent honored the scissor rect, so emulation of ClearView on the D3D9 DDI will unset scissor / clear / reset scissor to achieve the intended behavior of ClearView (e.g. this scissor manipulation isn't needed on the new D3D11.1 ClearView DDI which ignores scissor/viewports by definition.).

I still don't get in what configuration it is NOT supported. But anyway, I will double-check the P010/P016 first.

clsid2 commented 2 years ago

Are those values in ClearYUV[] correct? Input is clamped to 0.0f to 1.0f range.

kasper93 commented 2 years ago

> Are those values in ClearYUV[] correct? Input is clamped to 0.0f to 1.0f range.

Is it tho?

> For video views with YUV or YCbCr formats, ClearView doesn't convert color values. In situations where the format name doesn’t indicate _UNORM, _UINT, and so on, ClearView assumes _UINT. Therefore, 235.0f maps to 235 (rounds to zero, out of range/INF values clamp to target range, and NaN to 0).
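To make the quoted rule concrete, here is a minimal sketch of that float-to-integer conversion in portable C++. `ClearValueToUint` and its `maxValue` parameter are hypothetical names for illustration only; they are not part of D3D11 or of this patch. The sketch just models the rounding and clamping described in the quote, assuming an unsigned integer target channel.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not a D3D11 API): models the documented behaviour
// "rounds to zero, out of range/INF values clamp to target range, NaN to 0".
// maxValue is the largest value the target channel can hold,
// e.g. 255 for an 8-bit channel or 65535 for a 16-bit one.
inline std::uint32_t ClearValueToUint(float v, std::uint32_t maxValue)
{
    // NaN maps to 0; negative values (including -INF) clamp to the lower bound.
    if (std::isnan(v) || v <= 0.0f)
        return 0;

    // Values at or above the range (including +INF) clamp to the upper bound.
    if (v >= static_cast<float>(maxValue))
        return maxValue;

    // In-range values are truncated, i.e. rounded toward zero.
    return static_cast<std::uint32_t>(v);
}
```

Under this model 235.0f stays 235, while a normalized value such as 0.5f would end up as 0, which is why the values are passed unnormalized if the view really is treated as _UINT.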

Nevcairiel commented 2 years ago

The values are fine for NV12, so there are no fundamental problems. It's just P010 that doesn't seem to do anything for me. I also tested 0.5 and some other hardcoded values, with no change.

(For testing, I edited libavcodec/dxva2.c ff_dxva2_common_end_frame to just return 0 at the top, so the textures never get decoded into and thus keep their initial color.)

The RTV method works on P010 for me, at least (and so did the upload method before), although because it does actual texture copies, it is going to be quite a bit more expensive than a simple clear.

kasper93 commented 2 years ago

Works fine for me, but indeed, for P010 the underlying surface is 16-bit (for DXGI_FORMAT_P010), so the value has to be 1 << 15 instead of 1 << 9 as it is now. Will update the patch later...
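The arithmetic behind that can be sketched as follows. This assumes P010 keeps its 10 significant bits in the most significant bits of each 16-bit sample, and that clear levels are given on the usual 8-bit scale (128 for neutral chroma). `ContainerValue` is a hypothetical helper for illustration, not code from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: scale an 8-bit-referenced level to `bitDepth` bits,
// then shift it into the most significant bits of a `containerBits`-wide sample.
// Preconditions: 8 <= bitDepth <= containerBits <= 16.
constexpr std::uint32_t ContainerValue(std::uint32_t level8,
                                       unsigned bitDepth,
                                       unsigned containerBits)
{
    return (level8 << (bitDepth - 8)) << (containerBits - bitDepth);
}

// NV12: 8-bit samples in an 8-bit container, neutral chroma stays 128.
static_assert(ContainerValue(128, 8, 8) == 128, "NV12 neutral chroma");

// A plain 10-bit value: neutral chroma is 512, i.e. 1 << 9.
static_assert(ContainerValue(128, 10, 10) == (1u << 9), "10-bit neutral chroma");

// P010: the same 10-bit value stored in the top bits of a 16-bit sample,
// which gives 512 << 6 = 32768, i.e. 1 << 15.
static_assert(ContainerValue(128, 10, 16) == (1u << 15), "P010 neutral chroma");

// P016: full 16-bit samples, neutral chroma is again 1 << 15.
static_assert(ContainerValue(128, 16, 16) == (1u << 15), "P016 neutral chroma");
```

So with 1 << 9 the chroma planes are cleared to a value far below neutral in the 16-bit container, which would fit the green frames seen earlier.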

Nevcairiel commented 2 years ago

I was certain I tested that because I had that thought, but apparently not, or I screwed that case up. But you are right, that seems to be working.

Nevcairiel commented 2 years ago

Merged as 0bdaa5bce3d66eb7bcd4e7797427394630448ac7, and added a feature check on top, although the docs still leave me confused about whether the feature is truly optional or not.

kasper93 commented 2 years ago

Thanks, I'm back now, but the party is over. And about the flush: I was certain I did move it; well, I probably reverted it when cleaning up before committing.

Nevcairiel commented 2 years ago

It's fine, I wanted to get it into the next nightly for testing, because I want to make a release fairly soon. :)

clsid2 commented 2 years ago

Don't forget to update dav1d for upcoming release ;) There have been a bunch of useful changes since your recent update.

clsid2 commented 2 years ago

Just FYI: I had an FFmpeg configure issue after the gnutls update. Pkg-config failed due to a linking error (missing reference 'clock_gettime') in the gnutls test. Fixed it by replacing your regular GCC 11.2 package with the POSIX threads package.

Nevcairiel commented 2 years ago

I thought about that for a bit and will likely rebuild stuff one last time with the old package to reduce the dependency, but then swap to the POSIX threads variants for all future builds after the LAV release. It's just more feature complete at this point, and doesn't seem to have any of the issues anymore that early versions had.

Anyhow, this really isn't a discussion thread. :)

clsid2 commented 2 years ago

Sorry for abusing this PR.

The rebuilt gnutls unfortunately also has an issue. Avformat now has a dependency on GetSystemTimePreciseAsFileTime in kernel32, which is only available on Windows 8+. This is with a GCC 10.2 build.