elFarto / nvidia-vaapi-driver

A VA-API implementation using NVIDIA's NVDEC

libva error: init failed #299

Closed diniamo closed 3 weeks ago

diniamo commented 4 months ago
`NVD_LOG=1 vainfo`

```
Trying display: wayland
libva info: VA-API version 1.21.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /run/opengl-driver/lib/dri/nvidia_drv_video.so
4004.036737070 [108942-108942] ../src/vabackend.c: 168 init CUDA ERROR 'unknown error' (999)
libva info: Found init function __vaDriverInit_1_0
4004.036760315 [108942-108942] ../src/vabackend.c:2188 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 40
4004.036762314 [108942-108942] ../src/vabackend.c:2197 __vaDriverInit_1_0 Now have 0 (0 max) instances
4004.036764039 [108942-108942] ../src/vabackend.c:2223 __vaDriverInit_1_0 Selecting Direct backend
4004.043594504 [108942-108942] ../src/direct/nv-driver.c: 267 init_nvdriver Initing nvdriver...
4004.043614921 [108942-108942] ../src/direct/nv-driver.c: 285 init_nvdriver NVIDIA kernel driver version: 555.42.02, major version: 555, minor version: 42
4004.043618421 [108942-108942] ../src/direct/nv-driver.c: 292 init_nvdriver Got dev info: 100 1 2 6
4004.046716108 [108942-108942] ../src/direct/direct-export-buf.c: 27 findGPUIndexFromFd CUDA ERROR 'initialization error' (3)
4004.046723471 [108942-108942] ../src/vabackend.c:2253 __vaDriverInit_1_0 CUDA ERROR 'initialization error' (3)
libva error: /run/opengl-driver/lib/dri/nvidia_drv_video.so init failed
libva info: va_openDriver() returns 1
vaInitialize failed with error code 1 (operation failed),exit
```

I use NixOS with the new 555 beta driver. `NVD_BACKEND` is set to `direct` and `LIBVA_DRIVER_NAME` to `nvidia`. I'm not sure what other information to provide.

laengepl commented 3 months ago

@diniamo I think I managed to pinpoint the issue. It seems the open kernel module either doesn't support CUDA or fails to set up the /dev devices correctly. Force the open kernel module off with `hardware.nvidia.open = false;` in your NixOS configuration and it should work. Issue #194 seems relevant to ours.

diniamo commented 3 months ago

@laengepl it is set to false and it always has been.

elFarto commented 3 months ago

These early 999 'unknown error' failures are a bit of a pain to diagnose, but they indicate something is severely wrong with your setup, such as the driver not being installed correctly or permission problems with the /dev/nvidia* files.
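
For the /dev/nvidia* permission case, a quick check is to see whether the device nodes exist and whether the current user can open them. The following is only a minimal sketch: the helper name `check_device_nodes` is made up for illustration, and passing this check does not prove CUDA will initialise, it only rules out the most basic permission problem.

```python
import glob
import os


def check_device_nodes(pattern="/dev/nvidia*"):
    """Return (path, readable, writable) for each matching non-directory node.

    Hypothetical helper for illustration; not part of nvidia-vaapi-driver.
    """
    results = []
    for path in sorted(glob.glob(pattern)):
        # Skip directories such as /dev/nvidia-caps; only device nodes matter here.
        if os.path.isdir(path):
            continue
        readable = os.access(path, os.R_OK)
        writable = os.access(path, os.W_OK)
        results.append((path, readable, writable))
    return results


if __name__ == "__main__":
    nodes = check_device_nodes()
    if not nodes:
        print("no /dev/nvidia* nodes found; the kernel module may not be loaded")
    for path, readable, writable in nodes:
        status = "ok" if readable and writable else "NOT accessible by this user"
        print(f"{path}: {status}")
```

Run it as the same user that runs `vainfo`, since the result depends on that user's groups and permissions.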

diniamo commented 3 months ago

Are there things I can verify?

mirh commented 2 months ago

Can you check #253?

diniamo commented 2 months ago

Um, check what exactly?

mirh commented 2 months ago

Whether it's related to suspend, and whether the NVreg_PreserveVideoMemoryAllocations=1 trick addresses it.

diniamo commented 2 months ago

It's not related to suspension; however, that flag seems to have solved it?? Thanks.

diniamo commented 2 months ago

Not sure if the issue should be closed, up to you.

diniamo commented 2 months ago

Never mind, looks like I'd already had that flag. I have no idea what fixed it then.

mirh commented 2 months ago

I actually have half a clue, and if I hadn't accidentally botched the argument (turns out it has to be nvidia.NVreg_PreserveVideoMemoryAllocations=1 on the kernel command line) I would be none the wiser.
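
To confirm the argument is spelled with the `nvidia.` prefix on the running kernel's command line, something like the following can be used. It is a rough sketch: `has_module_param` is a hypothetical helper, and it only looks at `/proc/cmdline`, so it will not notice the option if it was set another way (for example through a modprobe configuration file).

```python
def has_module_param(cmdline, module, param, value):
    """True if `module.param=value` appears as a whole token in a kernel command line.

    Hypothetical helper for illustration only.
    """
    wanted = f"{module}.{param}={value}"
    # Kernel command-line arguments are whitespace-separated tokens.
    return wanted in cmdline.split()


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        cmdline = f.read()
    ok = has_module_param(
        cmdline, "nvidia", "NVreg_PreserveVideoMemoryAllocations", "1"
    )
    print("nvidia.NVreg_PreserveVideoMemoryAllocations=1 on command line:", ok)
```

A bare `NVreg_PreserveVideoMemoryAllocations=1` without the module prefix is reported as missing, which is the mistake described above.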

Sometimes(?) suspension doesn't just break whatever applications you have open. Everything stays broken globally until you restart the whole system (at least, even logging out and back in didn't fix it for me).

Hopefully I'm not having problems again now.

diniamo commented 2 months ago

My issue wasn't related to suspension.

diniamo commented 3 weeks ago

Do note that this issue still happens with the open drivers.

mirh commented 3 weeks ago

The open drivers use mesa, which doesn't need any of this. And if it does have bugs, you should report them there.

diniamo commented 3 weeks ago

Oh? Didn't know that. Either way, OpenCL fails with the open drivers as well, so I can't switch to them.

elFarto commented 3 weeks ago

> The open drivers use mesa, which doesn't need any of this.

That's not correct as I understand it. The open-source nouveau driver uses Mesa, but NVIDIA's open-source kernel driver uses the same user-space components as the closed-source kernel driver.
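
To tell which of these three kernel drivers is actually loaded, a rough sketch follows. It rests on two assumptions that should be verified on your own system: that nouveau shows up as `/sys/module/nouveau` when loaded, and that `/proc/driver/nvidia/version` contains the phrase "Open Kernel Module" for NVIDIA's open kernel modules. The names `classify_driver` and `read_text` are made up for this example.

```python
import os


def classify_driver(version_text, nouveau_loaded):
    """Guess the loaded GPU kernel driver from two observations.

    version_text: contents of /proc/driver/nvidia/version, or None if absent.
    nouveau_loaded: whether /sys/module/nouveau exists.
    The "Open Kernel Module" marker is an assumption about the file format.
    """
    if nouveau_loaded:
        return "nouveau (Mesa user space)"
    if version_text is None:
        return "no NVIDIA kernel module detected"
    if "Open Kernel Module" in version_text:
        return "NVIDIA open kernel module (NVIDIA user space)"
    return "NVIDIA proprietary kernel module (NVIDIA user space)"


def read_text(path):
    """Return the file contents, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    print(
        classify_driver(
            read_text("/proc/driver/nvidia/version"),
            os.path.isdir("/sys/module/nouveau"),
        )
    )
```

Only the nouveau case involves Mesa; both NVIDIA kernel module flavours go through NVIDIA's own user-space libraries, which is where this VA-API driver applies.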