elFarto / nvidia-vaapi-driver

A VA-API implementation using NVIDIA's NVDEC

HWacceleration stopped working with recent updates - with solution #182

Closed Korothi closed 1 year ago

Korothi commented 1 year ago

Hi, tested on Wayland with the latest everything. I already solved the issue by changing widget.dmabuf.force-enabled to true.

Here's how about:support looks: [image] [image]
Env (Wayland): [image]
Proof that it works (somehow): [image]

I'll give it a try on X11 later.

crimist commented 1 year ago

I also require widget.dmabuf.force-enabled=true when using Wayland. Maybe we should update the README to reflect that it may still be required on NVIDIA drivers >495?

elFarto commented 1 year ago

Seems I'm suffering from this issue as well. Deleting the gfx.blacklist.dmabuf keys from about:config and restarting lets it work, until I restart again and they reappear.

In the about:support page, do you have Blocklisted; failure code FEATURE_FAILURE_DL_BLOCKLIST_NO_ID in the DMABUF section when it's not working?

Monsterovich commented 1 year ago

> Maybe we should update the README to reflect that it may still be required on NVIDIA drivers >495?

@elFarto The driver 525.85.05 now requires this option.

elFarto commented 1 year ago

> > Maybe we should update the README to reflect that it may still be required on NVIDIA drivers >495?
>
> @elFarto The driver 525.85.05 now requires this option.

I'm not so sure. Something's changed in Firefox, and I think it's just a bug as clearing out the errors in about:config lets it work normally, once, after restarting. Using MOZ_DRM_DEVICE or force-enable to override this is just working around it.

elFarto commented 1 year ago

OK, after running the latest nightly version I think I see what's happened. There's an unrelated issue that's caused the Firefox team to disable DMA-BUF for certain NVIDIA driver versions.

So yes, for the moment widget.dmabuf.force-enabled=true is needed to override this. I'll update the README.
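
For anyone who prefers to script the pref instead of toggling it in about:config, here is a minimal sketch. It assumes the pref can be persisted through a user.js file in the Firefox profile directory; the function name and the profile path argument are made up for illustration, so pass your own profile directory (about:profiles lists it).

```shell
#!/bin/sh
# Sketch only: append the pref to user.js in a Firefox profile directory.
# force_enable_dmabuf is a hypothetical helper name; the profile path is
# whatever directory you pass in, nothing is auto-detected.
force_enable_dmabuf() {
    profile_dir="$1"
    pref_line='user_pref("widget.dmabuf.force-enabled", true);'
    # Skip the append if the exact line is already present.
    if ! grep -qxF "$pref_line" "$profile_dir/user.js" 2>/dev/null; then
        printf '%s\n' "$pref_line" >> "$profile_dir/user.js"
    fi
}
```

Usage would look like `force_enable_dmabuf "$HOME/.mozilla/firefox/<your-profile>"`, followed by a Firefox restart. Keep in mind that this only overrides the blocklist; it does not fix the underlying driver issue.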

Monsterovich commented 1 year ago

> > > Maybe we should update the README to reflect that it may still be required on NVIDIA drivers >495?
> >
> > @elFarto The driver 525.85.05 now requires this option.
>
> I'm not so sure. Something's changed in Firefox, and I think it's just a bug as clearing out the errors in about:config lets it work normally, once, after restarting. Using MOZ_DRM_DEVICE or force-enable to override this is just working around it.

I'm still sitting on FF 107.0, and all that has changed for me is the driver.

elFarto commented 1 year ago

I believe the blacklist is something that can be updated by the Firefox team remotely, so it doesn't require you to update your version.

philipl commented 1 year ago

Despite the bug saying it is fixed in 530, I saw it blocked on 530 here. 🙄

jeois commented 1 year ago

Sorry for hijacking that other thread #179 where integrated GPU users were setting MOZ_DRM_DEVICE.

I'm still not sure how I fixed it on Nvidia driver 530, as I likely had widget.dmabuf.force-enabled set to true this whole time, but perhaps I did toggle it at some point. In any case, this is the best explanation after reading that bugzilla thread.

I've always had FF crashes upon resume from sleep. However, I can't believe their solution was to disable DMABUF for all Nvidia drivers 510-530. So in order to "fix" something which only requires the user to restart their application upon resume, they blacklisted everything and disabled our functionality even when it was running fine, smh.

Based on their conversation, they think this issue cropped up in Nvidia version 525 series and then added 510 and 515, but I have had that suspend/crash issue for years and disabling DMA_BUF never solved it. They do seem to be getting our crash reports though. They assumed it would be an easy fix from Nvidia in the next version, which would have been 530 series at the time, but it seems they also blocked 530 as well, wth.

Darkspirit commented 1 year ago

> So yes, for the moment widget.dmabuf.force-enabled=true is needed to override this. I'll update the README.

a) Please don't because users would run into a confirmed use-after-free Nvidia EGL driver bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1788573#c26

b) Please do it only if you explicitly state that the Nvidia suspend service must be enabled and NVreg_PreserveVideoMemoryAllocations=1 must be set:

jeois commented 1 year ago

> b) Please do it only if you explicitly state that the Nvidia suspend service must be enabled and NVreg_PreserveVideoMemoryAllocations=1 must be set

I can confirm that passing kernel parameter NVreg_PreserveVideoMemoryAllocations=1 and enabling systemd services nvidia-suspend, (nvidia-hibernate) and nvidia-resume actually did fix the crash on suspend/resume. The ArchWiki reference recommends creating /etc/modprobe.d/nvidia-power-management.conf with line "options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/var/tmp" which also works after enabling the services. Thanks for those references and suggesting to use Preserve Video Memory.
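
The setup described above can be sketched roughly as follows. This is only an illustration under a few assumptions: the function names are hypothetical, the services are assumed to be installed as nvidia-suspend.service, nvidia-hibernate.service and nvidia-resume.service by your driver package, and the directory argument exists only so the file can be generated somewhere harmless before copying it into place.

```shell
#!/bin/sh
# Sketch only: write the modprobe options line quoted above into
# nvidia-power-management.conf. The optional first argument is a
# hypothetical override for the target directory (default /etc/modprobe.d).
write_nvidia_pm_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf '%s\n' \
        'options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=/var/tmp' \
        > "$conf_dir/nvidia-power-management.conf"
}

# Enable the suspend/hibernate/resume services (needs root and systemd).
# Unit names are assumed to match what the driver package ships.
enable_nvidia_pm_services() {
    systemctl enable nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service
}
```

Run as root, `write_nvidia_pm_conf && enable_nvidia_pm_services` followed by a reboot should correspond to the steps above; whether an initramfs rebuild is also needed depends on the distribution.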

However, I still don't understand why this wasn't implemented instead of blocking DMA_BUF for Nvidia drivers, including 530 which should eventually include a fix. I can hardly imagine scenarios where users would be interrupted by sleep/resume and somehow discard work in a browser without saving. Therefore, I think a crash after suspend is a minor inconvenience at worst: a quick browser restart at resume. For those of us who have been trying to enable hardware video acceleration, disabling direct memory buffers isn't a solution even worth considering since it blocks key functionality only to alleviate a relatively small inconvenience, especially when we could have used Preserve Video Memory Allocation to prevent the crashes instead.

POMATu commented 1 year ago

> a) Please don't because users would run into a confirmed use-after-free Nvidia EGL driver bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1788573#c26

I do not use suspend/hibernate at all because I am on a desktop. My PC never turns off, so I'm not affected by this suspend bug. But without that setting you just get no hardware video decoding, which is the main point of this project. Hence I think everything you mentioned should be explained in the README, so users at least know what they can try to get it working again.

Because in 2023 it's not possible to find anything useful in search engines, and reading every GitHub issue is not very practical either.

Vednier commented 1 year ago

> So yes, for the moment widget.dmabuf.force-enabled=true is needed to override this. I'll update the README.

In that case, please add that it's highly recommended to enable the GPU process in Firefox with layers.gpu-process.enabled=true, which will absorb the scary crashes mentioned by Darkspirit. According to Mozilla devs, the GPU process was disabled by default because it's incompatible with Wayland; however, it still works perfectly on X11.
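
Since this suggestion only applies to X11, a small guard could look like the sketch below. It assumes the session type is exposed through the XDG_SESSION_TYPE environment variable and that the pref would be persisted as a user.js line; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the GPU-process pref line for X11 sessions and fail
# (print nothing) for anything else. The session type comes from the first
# argument, falling back to XDG_SESSION_TYPE when no argument is given.
gpu_process_pref_line() {
    session_type="${1:-${XDG_SESSION_TYPE:-}}"
    case "$session_type" in
        x11) printf '%s\n' 'user_pref("layers.gpu-process.enabled", true);' ;;
        *)   return 1 ;;
    esac
}
```

For example, `gpu_process_pref_line >> /path/to/profile/user.js` would add the line only when the current session reports x11.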

Vednier commented 1 year ago

Bad news - the fix for [@ NvGlEglGetFunctions ] didn't make it into the newest Nvidia drivers, so Mozilla decided to block DMABUF on Nvidia globally - https://bugzilla.mozilla.org/show_bug.cgi?id=1824778#c10

POMATu commented 1 year ago

> Bad news - the fix for [@ NvGlEglGetFunctions ] didn't make it into the newest Nvidia drivers, so Mozilla decided to block DMABUF on Nvidia globally - https://bugzilla.mozilla.org/show_bug.cgi?id=1824778#c10

Will there be any way to force-enable DMABUF? I am on old Nvidia drivers (515) and don't experience any crashes.

Darkspirit commented 1 year ago

You need to enable the Nvidia suspend service, NVreg_PreserveVideoMemoryAllocations=1 must be set, and then you can enable widget.dmabuf.force-enabled on about:config in Firefox:

Vednier commented 1 year ago

> then you can enable widget.dmabuf.force-enabled on about:config in Firefox

And what are the risks if you simply force-enable DMABUF without the suspend service? Just crashes? The GPU process makes this problem relatively minor.

ManuLinares commented 1 year ago

I had this problem and didn't see the change in the README. Thanks to this thread I enabled "widget.dmabuf.force-enabled" and now it works again.

I don't suspend/hibernate either, so I only enabled this.

nvidia-dkms 530.41.03-1
firefox 111.0.1-1.1