psieg / Lightpack

Lightpack and Prismatik open repository
GNU General Public License v3.0

Add Nvidia Frame Buffer Capture (NVFBC and NVIFR) source please for hookless high performance capture in games! #235

Open v00d00m4n opened 5 years ago

v00d00m4n commented 5 years ago

Nvidia uses hardware FBC for its own streaming and recording software, GeForce Experience. Steam also uses it for streaming to Steam Link.

They work independently of any API used in the game or software. FBC is effective for full-screen framebuffer capture and IFR for windowed capture. Here are some details:

NVFBC

Captures the framebuffer (front buffer) without any involvement from OpenGL or Direct3D.

Effectively a direct copy of the framebuffer irrespective of which application(s) drew it.

It generally only works sensibly in fullscreen mode. If you render in windowed mode and use NVFBC, it is going to capture the entire screen including your desktop and other unrelated windows.

NVIFR

Slightly more complicated and less performant than NVFBC, this can capture a single application.

In my experience this used to be how Steam would stream windowed-mode applications. I have not seen this capture path in a very long time and I am glad because performance was awful whenever it was used.

Here is some documentation: https://developer.nvidia.com/sites/default/files/akamai/designworks/docs/NVIDIA%20Capture%20SDK%20Programming%20Guide.pdf

http://on-demand.gputechconf.com/gtc/2016/presentation/s6307-shounak-deshpande-get-to-know-the-nvidia-grid-sdk.pdf

Here is the capture SDK itself: https://developer.nvidia.com/capture-sdk

ATI also has something similar, but I'm not very familiar with it; maybe google it.

Anyway, implementing this should greatly reduce the CPU load and performance impact, so please do this as soon as you can.

Krzychowy commented 5 years ago

@maxroehrl

I've made some initial tests and it seems to be working. DX9, DX12 and non exclusive fullscreen DX11 games now work with G-sync and it doesn't get disabled as soon as ambilight is turned on like it was with DDupl. Amazing work, I can't stress enough how much of a relief it is. I will do some more testing but from what I have seen so far this is an ultimate solution to all issues we were having here.

And there are also substantial performance gains over DDupl. For example in Metro Exodus DDupl causes 6,5% performance reduction, while with NvFBC the performance hit is only 2,1%. CPU side performance is also better, I've run the game at 1080p/Medium to create CPU bottleneck and performance reduction from DDupl was 3%, while with NvFBC it was only 1%.

Results in Time Spy are similar to Metro Exodus, 7% reduction to graphics score with DDupl and only 2,2% with NvFBC.

I have found one issue though, with DSR. I am using 3 Ambibox devices (Lightpack clones, they are even recognized as Lightpack by Prismatik) and depending on the game, when I use DSR only one device is working and the LEDs of the other devices are turned off. As an example, in Castlevania Lords of Shadow (2011, DX9) using any DSR multiplier causes this issue. But in Mad Max (2015, DX11) some lower multipliers work (up to 2.25x), while 3.00 and 4.00 don't. I am using the newest upload of your Prismatik version and I've tried different Downscale Factors. This issue doesn't occur with DDupl.

maxroehrl commented 5 years ago

@zomfg I added percentage scaling which allows for better tuning of the downscaling:

100% = no scaling
50% = 2x downscaling
25% = 4x downscaling
12% = 8x downscaling
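To make the mapping above concrete, here is a minimal sketch. It assumes the downscale factor is simply 100 divided by the percentage; that is an inference from the list, not something taken from the actual Prismatik code, and both function names are hypothetical. Under that assumption 12% gives roughly 8.3x, so the "8x" entry most likely corresponds to 12.5% with the percentage shown as a whole number.

```python
def downscale_factor(percent: float) -> float:
    """Linear downscale factor for a scaling percentage (hypothetical helper).

    Assumption: factor = 100 / percent, e.g. 50% -> 2x, 25% -> 4x.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return 100.0 / percent


def scaled_size(width: int, height: int, percent: float) -> tuple:
    """Size of the captured frame after downscaling each axis by the factor."""
    factor = downscale_factor(percent)
    # Never return a zero-sized frame, even for very small percentages.
    return max(1, round(width / factor)), max(1, round(height / factor))
```

For a 1920x1080 screen, `scaled_size(1920, 1080, 25)` gives a 480x270 frame, which is still far more detail than a handful of LED zones needs.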

I now tested it with a GTX 670 and it stops capturing while playing an Amazon Prime video as intended (only one FHD monitor).

@sblantipodi I opened the pull request #258.

@Krzychowy I only tested DSR in Borderlands 2 with 2.00 and 4.00 and it seems to work. The screenshots from NvFBC were valid with DSR enabled but the game seems to make a difference.

Krzychowy commented 5 years ago

@maxroehrl

Turns out it doesn't turn off devices. It just doesn't see the frame properly, as if only some part of the frame was read by the software and the rest was black. If I move the zone that is not displaying anything into the area that is read by the software, it will light up. So the problem is not devices not working, but for some reason the majority of the frame is read as black. As if there were a picture surrounded by huge black bars. The number of non-working LEDs increases with the DSR multiplier. So it looks like it sees a native resolution frame only and moves it into the top left corner, and it takes up less and less space as the DSR resolution increases. Makes sense.

Something like this (simple visualization created in Paint):

[Image: dsr]

The frame (grey) should fill the entire space but it doesn't. On screen it does but apparently not for the capture.

This can be mitigated by creating separate profiles for each resolution and taking this scaling into account, but that is a bit of a pain...

AndySledge commented 5 years ago

Works great, thank you so much !

v00d00m4n commented 5 years ago

> @maxroehrl
>
> Turns out it doesn't turn off devices. It just doesn't see the frame properly, as if only some part of the frame was read by the software and the rest was black. If I move the zone that is not displaying anything into the area that is read by the software, it will light up. So the problem is not devices not working, but for some reason the majority of the frame is read as black. As if there were a picture surrounded by huge black bars. The number of non-working LEDs increases with the DSR multiplier. So it looks like it sees a native resolution frame only and moves it into the top left corner, and it takes up less and less space as the DSR resolution increases. Makes sense.
>
> Something like this (simple visualization created in Paint):
>
> [Image: dsr]
>
> The frame (grey) should fill the entire space but it doesn't. On screen it does but apparently not for the capture.
>
> This can be mitigated by creating separate profiles for each resolution and taking this scaling into account, but that is a bit of a pain...

I noticed that Prismatik in general has scaling issues. By default, anything but 100% scaling screws up the UI, and the same happens with the capture area when DSR or DPI scaling is enabled. It's really a big issue which needs a fix.

Krzychowy commented 5 years ago

Yes scaling issues are another problem, but not major.

The issue with DSR however is a bit more serious. There is a simple mitigation though: you just need to create different Prismatik profiles for each DSR resolution, which is a bit of work, but not terrible, especially if you use DSR often. It all scales normally, so if you for example use 4.00x DSR resolution, then the capture is going to only be for the top left quarter of your actual display area. So if my native res is 3840x1600 then the 4.00x resolution is 7680x3200. I just create a black 7680x3200 picture in Paint and then create a 3840x1600 white area in the top left corner, save it, set it as my wallpaper temporarily and limit the capture area to this white area only, like this:

[Image: 20190428_070647]

You can do the same for any other resolution. It requires some work, especially for resolution multipliers that are not full numbers, but honestly as far as fixing the software issues goes this is as easy as it gets, it is extremely simple.
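The workaround above can be put into numbers. This is a minimal sketch under two assumptions taken from the observations in this thread rather than from any NvFBC documentation: the DSR factor multiplies the pixel count (so each axis grows by its square root), and the captured image only fills the top-left, native-sized part of the DSR desktop. The function names are hypothetical and not part of Prismatik.

```python
import math


def dsr_resolution(native_w: int, native_h: int, dsr_factor: float) -> tuple:
    """DSR desktop size, assuming the factor applies to the pixel count.

    Each axis therefore scales by sqrt(dsr_factor).
    """
    axis_scale = math.sqrt(dsr_factor)
    return round(native_w * axis_scale), round(native_h * axis_scale)


def shrink_zone_for_dsr(zone: tuple, dsr_factor: float) -> tuple:
    """Map a grab zone (x, y, w, h) laid out on the full DSR desktop into the
    top-left region that, per the observation above, holds the captured frame.
    """
    axis_scale = math.sqrt(dsr_factor)
    return tuple(round(value / axis_scale) for value in zone)
```

With a native 3840x1600 display, `dsr_resolution(3840, 1600, 4.0)` returns `(7680, 3200)`, and a zone at `(7000, 0, 680, 400)` on that desktop maps to `(3500, 0, 340, 200)`. For multipliers whose square root is not a whole number the results are rounded, so a profile built this way may still need a small manual adjustment.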

sblantipodi commented 5 years ago

@maxroehrl why hasn't your merge request been merged into a new version of Prismatik yet?

sblantipodi commented 5 years ago

at this point I have lost hope of seeing @maxroehrl's patch merged into the stable branch. how can we move forward?

zell2311 commented 5 years ago

I use two displays with two Adalight packs setup. Running @maxroehrl version of Prismatik from two user accounts on Windows 10, one for each of Adalight pack. It works beautifully with the exception of one scenario when full-screen application is on the second display – it triggers the first Adalight for some reason. Can this be fixed?

sblantipodi commented 5 years ago

this project is dead, we need devs, probably someone who can fork it and develop it more actively. @maxroehrl why don't you add a text field where users who are authorized by nvidia can enter the magic key themselves? this would remove the problem with licenses... what do you think about it?

zomfg commented 5 years ago

Just reading from the .ini would be simpler (and less obvious of a bypass), but I'm not sure that's the only part that's problematic. Reworking it into a (third party) plugin would be ideal for this situation.

sblantipodi commented 5 years ago

Is there someone who is able to write this plugin and wants to share it on the grey internet so that Prismatik can continue to integrate with it?

SudoBlocks commented 5 years ago

Can someone tell me: on this version, when I use NvFBC, the backlight does not turn back on after the screen has been turned off. How can I fix it? On the previous version, everything was fine. And what downscale should I set for a 1080 Ti at 2560x1440, and what does it affect?

gnif commented 4 years ago

Please be aware that Looking Glass doesn't include the "Magic" or NVIDIA Capture SDK headers for a very good reason... it's in breach of the Capture SDK License agreement. NVIDIA has been very clear about this when people started "altering" LG, and there were at least two GitHub repos that I am aware of shut down via legal action due to the offending code. NVIDIA defends this API; if they catch wind of your distribution of even the headers you could end up in very deep water very fast.

sblantipodi commented 4 years ago

@maxroehrl it seems that HDR mode in Red Dead Redemption 2 broke the nvidia framebuffer capture. If I enable HDR in the game while using the nvidia framebuffer capture, ambilight stops working, light is stopped. It works with win api.

Any idea? thanks

sblantipodi commented 4 years ago

@maxroehrl @psieg NVFBC can't be used due to "license problem".

Reading here: https://developer.nvidia.com/capture-sdk

they say that NVFBC is planned to be deprecated on Windows 10 starting with Windows 10 October 2019 Update for reasons explained in this document. Windows 10 provides native capture APIs that can be considered as alternatives to NVFBC. Explore this option (Github Sample).

This application demonstrates how to use Windows Desktop Duplication API (a.k.a. DDA or IDXGIOutputDuplication) to capture desktop images using a D3D11 context, and using NVENC D3D11 interop to efficiently compress the captured images for offline recording or live streaming.

https://github.com/NVIDIA/video-sdk-samples/tree/master/nvEncDXGIOutputDuplicationSample

isn't this good for Prismatik? isn't this the best way to get high performance capture without having problem with licenses?

zomfg commented 4 years ago

That demo project literally uses the same method/API for capturing frames as Prismatik's DDupl grabber. It's worth a look just in case, but unless there's some minor magic detail that they have in their version I wouldn't get my hopes up too much. The NVIDIA part of it happens after the frame is captured and is used to encode the frame for the video... which Prismatik does not need since it uses the raw RGB frame (downsized by GPU).

Edit: NVFBC deprecation doc says that it doesn't work well with virtual desktops/changing resolutions, containerized apps (Edge in private mode for ex) and the new GPU scheduling on W10. DDupl doesn't have those issues.

sblantipodi commented 4 years ago

@zomfg still no luck for a viable definitive solution for Prismatik and GSYNC with high performance grabbing. what maxroehrl released was wonderful but "it's not that legal" and it is now deprecated by microsoft/nvidia so I don't expect that it will work for a long time.

I really hope that we will find a viable solution to have high performance capture with GSYNC on.

zomfg commented 4 years ago

They also mention Windows.Graphics.Capture as an alternative, but I suspect it's just a C#/VB wrapper of DD for UWP stuff.

gnif commented 4 years ago

Windows.Graphics.Capture is a DXGI DD wrapper as you suspect. Also DD has far more latency in comparison to NvFBC. As for virtual desktops changing resolution, NvFBC actually works fine with a few minor quirks. I suspect they are dropping NvFBC for two reasons.

1) It's anti-competitive to allow ShadowPlay and Steam to use it on consumer GPUs, but nobody else.
2) Desktop Duplication works "well enough" for their liking.

I have it on good authority from developers at AMD that the DD implementation is a closed black box that only Microsoft know the internals of, which means that driver optimizations for DD are extremely unlikely to ever occur. DXGI DD will likely never reach the performance that NvFBC is capable of.

Krzychowy commented 4 years ago

It still works and probably will for quite some time, but with Microsoft trying to enforce updating by artificially locking new features to newer versions, you will eventually update and something will likely break. The issues mentioned in NVIDIA's document don't seem to affect gaming though: incorrectly scaled images we already see with DSR and can mitigate (it's also not like we need pixel-level accuracy here, it is just color info for ambilight), and the security container is probably not relevant either. The only question mark is what kind of nonsense Microsoft can pull with these changes in scheduling. Though the issue NVIDIA mentions here is unpredictable latency, so it's not like it is going to fall apart suddenly.

NVIDIA has 0-1 approach to things so if NVFBC cannot work perfect anymore then they ditch it, but as long as it will be supported in drivers in future generations, so not hard-ditched completely, I wouldn't panic yet. There are many far more outdated and abandoned things that still work.

Will see, this is not the kind of thing that will take an immediate effect. Though finding some reliable alternative would at last be great because it feels like some kind of freaking survival, every few months we come here because something is threatening this feature :P But this is typically what you get in the package with trying to go beyond.

Benik3 commented 4 years ago

> I have it on good authority from developers at AMD that the DD implementation is a closed black box that only Microsoft know the internals of, which means that driver optimizations for DD are extremely unlikely to ever occur.

Funny, when I wrote to Microsoft that DD breaks FreeSync/GSync, the only response I got was that it must be a problem in the drivers and the GPU manufacturers must fix it...

sblantipodi commented 4 years ago

but if NVFBC has been discontinued, what will ShadowPlay and Steam use? DD has known problems with gsync, and ShadowPlay and Steam obviously can't have this problem.

Krzychowy commented 4 years ago

What are other alternatives for this kind of ambilight clone btw? Getting some kind of passthrough from DP to some other device instead of having devices on USB and then needing software capture? Or maybe a second signal sent through HDMI? Though I can easily see this interfering with G-sync as well, not sure though.

Benik3 commented 4 years ago

It could be a solution. Anyway, I found that DesktopDuplication works with VRR if you disable Fullscreen Optimizations. You must do it for each game (in the properties of the .exe). The global setting in the registry doesn't work...

Krzychowy commented 4 years ago

> It could be a solution. Anyway, I found that DesktopDuplication works with VRR if you disable Fullscreen Optimizations. You must do it for each game (in the properties of the .exe). The global setting in the registry doesn't work...

Unfortunately that doesn't work in the vast majority of cases. The entire problem started when Microsoft started to remove Exclusive Fullscreen functionality from Windows, I think starting from version 1709. Up to 1803 you could mitigate this by using 'Disable Fullscreen Optimizations', which would bring back proper Fullscreen in games that support it, but starting from 1809 this no longer works. There may still be some rare exceptions, but overall it doesn't work. And then you also have to consider a whole host of games that do not support Exclusive Fullscreen in the first place (likely the big majority of games overall). So in the end this is an extremely unreliable solution. It also adds quite a bit of input lag, and some games really feel much heavier with DesktopDuplication enabled. And then you have WinAPI, which theoretically doesn't break VRR but is hit or miss on whether it even works in a particular game, and adds a ridiculous amount of latency and jitter.

sblantipodi commented 4 years ago

> > It could be a solution. Anyway, I found that DesktopDuplication works with VRR if you disable Fullscreen Optimizations. You must do it for each game (in the properties of the .exe). The global setting in the registry doesn't work...
>
> Unfortunately that doesn't work in the vast majority of cases. The entire problem started when Microsoft started to remove Exclusive Fullscreen functionality from Windows, I think starting from version 1709. Up to 1803 you could mitigate this by using 'Disable Fullscreen Optimizations', which would bring back proper Fullscreen in games that support it, but starting from 1809 this no longer works. There may still be some rare exceptions, but overall it doesn't work. And then you also have to consider a whole host of games that do not support Exclusive Fullscreen in the first place (likely the big majority of games overall). So in the end this is an extremely unreliable solution. It also adds quite a bit of input lag, and some games really feel much heavier with DesktopDuplication enabled. And then you have WinAPI, which theoretically doesn't break VRR but is hit or miss on whether it even works in a particular game, and adds a ridiculous amount of latency and jitter.

unfortunately I agree with this, hope to see a better solution in the near future. it's a shame that NVIDIA does not even answer polite requests to use NVFBC.

sblantipodi commented 4 years ago

The Windows 10 native capture API was released after DDUPL, and I doubt that nvidia would drop NVFBC for something that works "as badly" as DDUPL in games. :)

So I would not consider the Native Capture API to be on par with DDUPL, that would not make much sense, don't you agree?

I hope that someone will give Native Capture API a try.

Benik3 commented 4 years ago

> I hope that someone will give Native Capture API a try.

What do you mean by Native Capture API? Can you post some link?

sblantipodi commented 4 years ago

> > I hope that someone will give Native Capture API a try.
>
> What do you mean by Native Capture API? Can you post some link?

here it is: https://developer.nvidia.com/capture-sdk

there are some examples here: https://github.com/NVIDIA/video-sdk-samples/tree/master/nvEncDXGIOutputDuplicationSample

zomfg commented 4 years ago

Read the document from the page you link:

and as mentioned above there is a good chance that WGC is just a C# wrapper for DDupl

and did you try the app I linked? it uses WGC

sblantipodi commented 4 years ago

But why would nvidia deprecate NVFBC if DDupl isn't good enough for gamers? DDupl does not work with GSYNC, so how they can do this isn't clear to me; they probably have a streaming "suite" that doesn't break GSYNC.

How does Prismatik process the screen capture after it is captured with DDUPL? Is it possible that GSYNC is broken after the screen capture?

Is there some update to make to Prismatik to switch from Win8 DDupl to Win10 DDupl?

I tried the app you linked but it does not even work :)

Benik3 commented 4 years ago

I tried to play with DDupl example (not WGC) and update all libraries without change. I also tried OBS studio (which also use DDupl) and it has the same problem...

SudoBlocks commented 4 years ago

So, what about NVFBC now? Is it the best capture method for Nvidia at the moment, or is the latest Prismatik version better?

zomfg commented 4 years ago

> But why would nvidia deprecate NVFBC if DDupl isn't good enough for gamers? DDupl does not work with GSYNC, so how they can do this isn't clear to me; they probably have a streaming "suite" that doesn't break GSYNC.

ask nvidia and microsoft

speaking of microsoft, w10 2004 is out with a new WDDM, maybe that can help (I have no idea if it's directly related, just the closest change to the subject)

> How does Prismatik process the screen capture after it is captured with DDUPL?

same way as with every other grabber, which is WinAPI on windows, so if you don't have the same issues with the latter...

> Is it possible that GSYNC is broken after the screen capture?

highly doubt it

> Is there some update to make to Prismatik to switch from Win8 DDupl to Win10 DDupl?

there's only one DDupl API

> So, what about NVFBC now? Is it the best capture method for Nvidia at the moment, or is the latest Prismatik version better?

in theory the latest Prismatik is not any better in this area

sblantipodi commented 4 years ago

do you know if prismatik uses this method for DDUPL? https://docs.microsoft.com/en-us/windows/win32/api/dxgi1_2/nf-dxgi1_2-idxgioutputduplication-acquirenextframe

zomfg commented 4 years ago

yes

SudoBlocks commented 4 years ago

Can someone compile the latest changes into the last NVFBC version, or add NVFBC support?

Benik3 commented 4 years ago

I just found a new thing: VRR stops working only when you capture the whole screen. If you capture only a window (and it doesn't matter whether it's a full screen game or e.g. the Spotify app), VRR works fine.

zomfg commented 4 years ago

as far as I'm aware, DDupl can only capture the whole screen, so whatever you used to do window capture might be using something else

Benik3 commented 4 years ago

It's probably the WGC implementation. Tried with this example and with OBS: https://github.com/robmikh/Win32CaptureSample

sblantipodi commented 4 years ago

just to give some numbers, on my system, DDUPL capture at 38FPS, NVFBC capture at 320FPS

xD

Benik3 commented 4 years ago

? What should it mean? I don't have problem to capture at 75FPS and I don't see any impact on performance...

sblantipodi commented 4 years ago

> ? What should it mean? I don't have problem to capture at 75FPS and I don't see any impact on performance...

at what resolution? at 3840x2160 my RTX2080Ti/5930K can't capture at more than 38FPS using DDUPL. with NVFBC I can capture at 320FPS using prismatik.

Benik3 commented 4 years ago

1920x1080 AMD RX580
Maybe some problem with the 4k?

sblantipodi commented 4 years ago

> 1920x1080 AMD RX580
> Maybe some problem with the 4k?

I don't think so, 4K is only 4 times bigger than 1080P :D

zomfg commented 4 years ago

DDupl framerate depends on what's happening on the screen. My stock 4670k with its iGPU can do 100 fps at 3440x1440 ¯\_(ツ)_/¯

NVFBC grabber COULD be counting wrong and/or sending the same frame multiple times in a row (but looks like the fork isn't there anymore to check this)
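One way to check the duplicate-frame hypothesis would be to compare consecutive frames and count only those that differ from the one before. This is a minimal sketch, assuming the grabbed frames are available as raw byte buffers; `count_frames` is a hypothetical helper and is not code from Prismatik or from the NvFBC fork.

```python
import hashlib
from typing import Iterable, Tuple


def count_frames(frames: Iterable[bytes]) -> Tuple[int, int]:
    """Return (delivered, changed) for a sequence of raw frame buffers.

    'delivered' counts every frame handed over by the grabber; 'changed'
    counts only frames whose content differs from the previous frame.
    """
    delivered = 0
    changed = 0
    previous_digest = None
    for frame in frames:
        delivered += 1
        digest = hashlib.sha256(frame).digest()
        if digest != previous_digest:
            changed += 1
            previous_digest = digest
    return delivered, changed
```

If a grabber reported 320 delivered frames per second while only a fraction of them had changed, the headline number would overstate how often the screen content is actually refreshed.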

sblantipodi commented 4 years ago

> DDupl framerate depends on what's happening on the screen. My stock 4670k with its iGPU can do 100 fps at 3440x1440 ¯\_(ツ)_/¯
>
> NVFBC grabber COULD be counting wrong and/or sending the same frame multiple times in a row (but looks like the fork isn't there anymore to check this)

I'm using a 3840x2160 monitor, which has about 67% more pixels than 3440x1440. I learned that these things do not scale linearly, so the ratio isn't strange to me.
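For reference, the pixel counts behind the resolutions compared in the last few comments can be worked out directly. A small sketch; the helper names are made up for illustration, and it only compares raw pixel counts, which says nothing about how capture cost actually scales:

```python
def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in a frame."""
    return width * height


def pixel_ratio(a: tuple, b: tuple) -> float:
    """How many times more pixels resolution a has than resolution b."""
    return pixel_count(*a) / pixel_count(*b)


uhd = (3840, 2160)        # 8,294,400 pixels
ultrawide = (3440, 1440)  # 4,953,600 pixels
full_hd = (1920, 1080)    # 2,073,600 pixels

uhd_vs_full_hd = pixel_ratio(uhd, full_hd)
uhd_vs_ultrawide_percent = (pixel_ratio(uhd, ultrawide) - 1) * 100
```

Here `uhd_vs_full_hd` is exactly 4.0 and `uhd_vs_ultrawide_percent` comes out at roughly 67.4.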

SudoBlocks commented 4 years ago

so, what about new NVFBC version?

sblantipodi commented 4 years ago

NVFBC is a no-go due to licensing issues, but it remains the only good way to do what Prismatik does on nvidia cards.