Frogging-Family / nvidia-all

Nvidia driver latest to 396 series AIO installer

[Feature Request] NvFBC Patch #92

Open kekonn opened 2 years ago

kekonn commented 2 years ago

Would you consider integrating (an option to apply) the following: https://github.com/keylase/nvidia-patch

Or is it safe to apply to the drivers as is? I would think not since it's a binary patch and might interfere with other patches your system makes.

Arcitec commented 2 years ago

NvFBC has no uses anymore?

If you are using the OBS "NvFBC" plugin, then stop it. The plugin was just a temporary hack back when Wayland lacked any screen capture API (before things like pipewire). The OBS NvFBC plugin has been discontinued by its author because it offers zero benefits over the native X11/Wayland "full screen capture" driver that exists in normal OBS.

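The NvFBC plugin works as follows:

  • NvFBC API retrieves the memory address of the GPU's onboard framebuffer.
  • The OBS plugin copies all the data from that framebuffer into system RAM.
  • OBS itself does video compositing in the system RAM.
  • OBS sends the frame data from system RAM back into the GPU ram.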

The native non-NvFBC screen capture plugin in OBS ("Screen Capture (XSHM)") does the exact same thing and has the exact same performance, because NVIDIA's driver exposes the framebuffer to the native Linux screen capture API.

No need to hack your NVIDIA drivers with NvFBC patch. That technique is ancient.

There is only one area where NvFBC makes sense. But nobody on Linux has coded this yet. The idea is: Write a program that calls NvFBC to get the framebuffer address, and then send it directly to the NvENC API to encode it in-GPU without ever copying anything into system RAM. NOTHING ON LINUX DOES THIS. It would be awesome as an open source project idea though.
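To put rough numbers on the copy cost described above, here is a minimal back-of-the-envelope sketch. The resolution (1920x1080), pixel format (4 bytes per pixel), and frame rate (60 fps) are assumptions chosen for illustration, not measurements from this thread, and the function names are hypothetical.

```python
# Rough arithmetic only: resolution, pixel format, and frame rate are assumptions.

def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame (4 bytes per pixel assumes BGRA/RGBA)."""
    return width * height * bytes_per_pixel


def bus_traffic_per_second(width: int, height: int, fps: int, copies: int) -> int:
    """Bytes per second moved between GPU memory and system RAM.

    copies=2 models a download to system RAM plus an upload back to the GPU.
    copies=0 models a capture that stays in GPU memory until it is encoded.
    """
    return frame_bytes(width, height) * fps * copies


if __name__ == "__main__":
    round_trip = bus_traffic_per_second(1920, 1080, fps=60, copies=2)
    zero_copy = bus_traffic_per_second(1920, 1080, fps=60, copies=0)
    print(f"round trip: {round_trip / 1e6:.0f} MB/s")
    print(f"zero copy:  {zero_copy / 1e6:.0f} MB/s")
```

Under these assumptions the round trip moves roughly 1 GB of raw pixels per second, while the in-GPU path moves none. This only counts bytes; it says nothing about how a particular driver or program actually schedules those copies.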

kekonn commented 2 years ago

Thank you for the clarification! I still see a ton of instructions recommending it, for example to increase performance with Sunshine (a user on the AUR page recommended using nvlax).

Arcitec commented 2 years ago

@kekonn Interesting. If Sunshine is able to push NvFBC -> NvENC directly then it would be a high performance solution. If not, then NvFBC is unnecessary, since the NVIDIA driver itself basically hooks the Linux "desktop capture" API into NvFBC internally.

NvFBC only matters (and only provides a benefit) if you have software that directly sends the NvFBC data to NvENC without copying it to system RAM first.

It kinda sounds like Sunshine copies into RAM first (slow), since people complain about Sunshine latency here: https://github.com/loki-47-6F-64/sunshine/issues/316

kekonn commented 2 years ago

Yes, but someone on the AUR mentioned that the latency completely vanished when using nvidia-utils-nvlax.

Nevertheless, there is also the NVENC session-limit patch in there. Is it safe to apply this to your drivers?

GenocideStomper commented 2 years ago

The native non-NvFBC screen capture plugin in OBS ("Screen Capture (XSHM)") does the exact same thing and has the exact same performance, because NVIDIA's driver exposes the framebuffer to the native Linux screen capture API.

Sorry to butt into the conversation, but I don't see this behaviour at all. I'm on X11. For me on an old outdated CPU, Screen Capture (XSHM) used 2-3x the CPU resources, and the recording has noticeable hitches. NvFBC on the other hand doesn't cause a high CPU usage for OBS.

Arcitec commented 2 years ago

@spider3000 The author of the NvFBC plugin for OBS themselves said that there's no performance difference and that it uses the same API. In fact, the NvFBC-source plugin in OBS has been discontinued by its author like 1-2 years ago because of this reason. They recommend using the XSHM "desktop capture" built into OBS now.

They also explained everything about how the NvFBC-source plugin just copies the hardware framebuffer into system RAM (very slow), since OBS "needs" it to be able to do software compositing of other effects.

I am definitely interested if your results are real... I'll put "try NvFBC" on my endless "TODOs that I will never DO" hehe. xD

Even if the NvFBC plugin is somehow faster than XSHM on some computers, the real issue remains: The only way to get Windows-like capture performance (1% CPU usage, 5% GPU usage) is by using something that connects NvFBC directly into NvENC without copying to system RAM first.

GenocideStomper commented 2 years ago

I don't know what to tell you, other than that I see a noticeable difference in htop between the two sources, and anecdotally speaking I feel my system is a bit more laggy with XSHM.

Quick test on desktop, I tried to record peaks:

NvFBC: [image]

XSHM: [image]

Mind you, I don't really care about this feature request, because I can apply nvidia-patch myself. nvidia-patch can also lag behind by a few days, up to 2 weeks. And the performance difference might only be noticeable on old hardware like my Haswell quad-core. I just didn't agree with the statement, and don't understand why the original author would say that the sources are the same. If it uses the same API, how come I need to use nvidia-patch's patch-fbc.sh for the NvFBC plugin, but don't need to patch for the XSHM source? Maybe it's a re-implementation of the API that doesn't have the exact same performance. I'm on X11, if that makes a difference.
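For anyone who wants to compare the two sources with something more repeatable than watching peaks in htop, here is a minimal sketch that samples a process's CPU time from /proc on Linux. The function names and the 5-second window are assumptions for illustration, and the OBS PID has to be looked up separately.

```python
import os
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The process name (field 2) is wrapped in parentheses and may contain
    # spaces, so split on the last ")" before splitting the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime


def cpu_percent(ticks_before: int, ticks_after: int,
                seconds: float, ticks_per_second: int) -> float:
    """CPU usage over the window, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * seconds)


def sample(pid: int, seconds: float = 5.0) -> float:
    """Measure the average CPU usage of a running process over a time window."""
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, seconds, os.sysconf("SC_CLK_TCK"))
```

Calling `sample(pid)` with the PID of OBS while recording the same scene once per source gives two averages to compare. The result is relative to a single core, so a multi-threaded process can exceed 100.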

Tk-Glitch commented 2 years ago

Or is it safe to apply to the drivers as is?

It is supposed to be. People have been doing that for a while just fine. I'm reluctant to support it directly but that shouldn't stop you from using the patch with our packages.

TeheeFB commented 2 years ago

https://git.dec05eba.com/gpu-screen-recorder/about/ It looks like the godlike program that actually uses this feature now exists, and I can vouch that it works. I'm not following this project at all, but when I used it to install my drivers and saw that it applies little hacks or optimizations or something like that, I assumed it also applied the NvFBC patch, which caused me a bit of confusion. Check out the program, very cool.

krakow10 commented 2 years ago

The NvFBC plugin works as follows:

  • NvFBC API retrieves the memory address of the GPU's onboard framebuffer.
  • The OBS plugin copies all the data from that framebuffer into system RAM.
  • OBS itself does video compositing in the system RAM.
  • OBS sends the frame data from system RAM back into the GPU ram.

This is not correct. https://obsproject.com/forum/resources/obs-nvfbc.796/updates "Thanks to Torge Matthies this plugin will now copy frames from the GPU directory into OBS. Not more copies into system memory and back anymore."

I can personally vouch for the usefulness and performance benefits, I sorely missed using the plugin when a Vulkan update broke it last month.

dec05eba commented 2 years ago

The NvFBC plugin copies to an OpenGL texture, yes, but OBS itself can't pass the OpenGL texture to the hardware video encoder without first copying it to system RAM and then back to the video encoder (even when using NVENC). This OBS pull request fixes that: https://github.com/obsproject/obs-studio/pull/4974 but, as Bananaman said, I believe it still uses the CPU for certain composition (and color space conversions), so it still copies the data to the CPU.

Right now the NvFBC plugin is faster than XSHM but not really faster than XComposite. XComposite has downsides, though: it can't capture windows with client-side decorations (GNOME), it breaks vsync when no compositor is running, and it breaks G-SYNC.

In my own screen recorder I use NvFBC and feed the data to NvENC without copying, and the performance isn't really better than using XComposite (in the same program). The main benefits are the ones mentioned above: it breaks fewer things, and it can record your entire monitor instead of only a single window.

Fox2Code commented 1 year ago

I use nvidia-utils-nvlax from chaotic-aur on my Linux system and I have NVENC working fine in OBS.

nvlax could probably be added as an option in nvidia-tkg build process -> https://github.com/illnyang/nvlax

The NvFBC plugin for OBS has been deprecated. According to the developer, it cannot be ported because OBS 28+ uses EGL instead of GLX. (Source: https://gitlab.com/fzwoch/obs-nvfbc)

ryanmusante commented 1 year ago

Is this ONLY for encoding and not for decoding usage?

https://aur.archlinux.org/packages/nvlax-git