elFarto / nvidia-vaapi-driver

A VA-API implementation using NVIDIA's NVDEC

Not working on Optimus laptop #11

Open Iliolou opened 2 years ago

Iliolou commented 2 years ago

Thank you for your effort. It compiles fine on Gentoo, but it won't run. I am on Skylake with a GeForce GTX 950M; mpv with nvdec works fine. Here are the logs:


vainfo

libva info: VA-API version 1.12.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib64/va/drivers/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
[7599-7599] ../src/vabackend.c:1522       __vaDriverInit_1_0 Initing NVIDIA VA-API Driver
[7599-7599] ../src/export-buf.c:  90          findCudaDisplay Found 4 EGL devices
[7599-7599] ../src/export-buf.c:  94          findCudaDisplay Got EGL_CUDA_DEVICE_NV value '0' from device 0
[7599-7599] ../src/export-buf.c: 121             initExporter Got EGLDisplay from CUDA device
[7599-7599] ../src/export-buf.c:  56                reconnect Reconnecting to stream
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.12 (libva 2.12.0)
vainfo: Driver version: VA-API NVDEC driver
[7599-7599] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 2
[7599-7599] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 3
[7599-7599] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 4
[7599-7599] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 12
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      <unknown profile>               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
[7599-7599] ../src/vabackend.c:1511              nvTerminate In nvTerminate
[7599-7599] ../src/export-buf.c:  36                    debug [EGL] eglStreamImageConsumerConnectNV: EGL_BAD_STATE_KHR error: In EGL Access Table::stream2.consumer.disconnect: Consumer handle does not match reservation (0x4358c771 vs 0x4358c779).

mpv

[vo/gpu/vaapi-egl] Trying to open a x11 VA display...
[7573-7582] ../src/vabackend.c:1522       __vaDriverInit_1_0 Initing NVIDIA VA-API Driver
[7573-7582] ../src/export-buf.c:  90          findCudaDisplay Found 4 EGL devices
[7573-7582] ../src/export-buf.c:  94          findCudaDisplay Got EGL_CUDA_DEVICE_NV value '0' from device 0
[7573-7582] ../src/export-buf.c: 121             initExporter Got EGLDisplay from CUDA device
[7573-7582] ../src/export-buf.c:  56                reconnect Reconnecting to stream
[vo/gpu/vaapi-egl/vaapi] Initialized VAAPI: version 1.12
[7573-7582] ../src/vabackend.c: 865      nvQueryImageFormats In nvQueryImageFormats
[vo/gpu/vaapi-egl] Going to probe surface formats (may log bogus errors)...
[7573-7582] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 2
[7573-7582] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 3
[7573-7582] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 4
[7573-7582] ../src/vabackend.c: 202              vaToCuCodec vaToCuCodec: Unknown codec: 12
[7573-7582] ../src/vabackend.c: 391           nvCreateConfig In nvCreateConfig with profile: 0 with 0 attributes
[7573-7582] ../src/vabackend.c:1233 nvQuerySurfaceAttributes with 1 (nil) 0
[7573-7582] ../src/vabackend.c:1233 nvQuerySurfaceAttributes with 1 0x7f92987dab00 5
[7573-7582] ../src/vabackend.c: 472        nvCreateSurfaces2 creating 1 surface(s) 128x128, format 1
[7573-7582] ../src/vabackend.c: 957            nvDeriveImage In nvDeriveImage
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglDestroyImageKHR: _eglDestroyImageCommon
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglDestroyImageKHR: _eglDestroyImageCommon
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglDestroyImageKHR: _eglDestroyImageCommon
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglDestroyImageKHR: _eglDestroyImageCommon
[7573-7582] ../src/vabackend.c:1438    nvExportSurfaceHandle got 0x7f9298250060
[7573-7582] ../src/export-buf.c: 326            exportCudaPtr eglExportDMABUFImageQueryMESA: 0x7f92987dd291 NV12 (3231564e) planes:2 mods:300000000606014 300000000606013
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglExportDMABUFImageMESA: EGL_BAD_MATCH error: In eglExportDMABUFImageMESA: EGLImage can't be exported to dma-buf
[7573-7582] ../src/export-buf.c: 331            exportCudaPtr Unable to export image
[7573-7582] ../src/export-buf.c:  36                    debug [EGL] eglCreateImageKHR: invalid pitch
[7573-7582] ../src/vabackend.c: 391           nvCreateConfig In nvCreateConfig with profile: 1 with 0 attributes
[7573-7582] ../src/vabackend.c:1233 nvQuerySurfaceAttributes with 1 (nil) 0
[7573-7582] ../src/vabackend.c:1233 nvQuerySurfaceAttributes with 1 0x7f92987e03c0 5
[7573-7582] ../src/vabackend.c: 472        nvCreateSurfaces2 creating 1 surface(s) 128x128, format 1
[7573-7582] ../src/vabackend.c: 957            nvDeriveImage In nvDeriveImage
[7573-7582] ../src/vabackend.c:1438    nvExportSurfaceHandle got 0x7f9298250060
[7573-7582] ../src/export-buf.c: 231            exportCudaPtr cuda error 'unspecified launch failure' (719)

mplayer-vaapi
VO: Description: VA API with X11
VO: Author: Gwenole Beauchesne <gbeauchesne@splitted-desktop.com>
[7544-7544] ../src/vabackend.c: 472        nvCreateSurfaces2 creating 1 surface(s) 1280x692, format 1
[7544-7544] ../src/vabackend.c: 472        nvCreateSurfaces2 creating 1 surface(s) 1280x692, format 1
[7544-7544] ../src/vabackend.c: 957            nvDeriveImage In nvDeriveImage
[7544-7544] ../src/vabackend.c: 899            nvCreateImage created image id: 3
[7544-7544] ../src/vabackend.c: 957            nvDeriveImage In nvDeriveImage
[7544-7544] ../src/vabackend.c: 899            nvCreateImage created image id: 5
*** [scale] Exporting mp_image_t, 1280x692x12bpp YUV planar, 1328640 bytes
*** [vo] Allocating mp_image_t, 1280x692x12bpp YUV planar, 1328640 bytes
[7544-7544] ../src/vabackend.c:1091               nvPutImage In nvPutImage
[ass] PlayResX undefined, setting to 384
[7544-7544] ../src/vabackend.c: 855             nvPutSurface In nvPutSurface
[vo_vaapi] vaPutSurface(): the requested function is not implemented
A:   0.0 V:   0.0 A-V:  0.013 ct:  0.000   0/  0 ??% ??% ??,?% 0 0
[h264 @ 0x7f032d2246c0]nal_unit_type: 1(Coded slice of a non-IDR picture), nal_ref_idc: 0
[7544-7544] ../src/vabackend.c:1091               nvPutImage In nvPutImage
[7544-7544] ../src/vabackend.c: 855             nvPutSurface In nvPutSurface

Thank you.
elFarto commented 2 years ago

Can you paste the output of `sudo cat /sys/module/nvidia_drm/parameters/modeset`?

Iliolou commented 2 years ago
```
~ # cat /sys/module/nvidia_drm/parameters/modeset
Y
```

I don't know if it matters, but I am on an Optimus laptop. However, mpv with nvdec-copy works fine.

elFarto commented 2 years ago

It's almost certainly caused by being an Optimus setup. Unfortunately I don't have one of those to test with.

The error with mpv is odd: the driver seems prepared to export the image (the eglExportDMABUFImageQueryMESA call returns the expected values), but then fails on the actual export. Then again, the export doesn't know where the image is going to be imported, so it might not be an Optimus issue after all?
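For context, the export in question is the two-step EGL_MESA_image_dma_buf_export flow, roughly sketched below. This is an untested illustration with error handling omitted, not the driver's exact code.

```
/* Rough sketch of the two-step MESA dma-buf export the log shows:
 * the query step succeeds (NV12, 2 planes, modifiers), but the actual
 * export is what returns EGL_BAD_MATCH on this setup. */
#include <EGL/egl.h>
#include <EGL/eglext.h>

static EGLBoolean export_image_to_dmabuf(EGLDisplay dpy, EGLImageKHR img,
                                         int fds[4], EGLint strides[4],
                                         EGLint offsets[4])
{
    PFNEGLEXPORTDMABUFIMAGEQUERYMESAPROC queryExport =
        (PFNEGLEXPORTDMABUFIMAGEQUERYMESAPROC)
            eglGetProcAddress("eglExportDMABUFImageQueryMESA");
    PFNEGLEXPORTDMABUFIMAGEMESAPROC doExport =
        (PFNEGLEXPORTDMABUFIMAGEMESAPROC)
            eglGetProcAddress("eglExportDMABUFImageMESA");

    int fourcc, num_planes;
    EGLuint64KHR modifiers[4];
    /* Step 1: query format/planes/modifiers -- this works in the log. */
    if (!queryExport(dpy, img, &fourcc, &num_planes, modifiers))
        return EGL_FALSE;

    /* Step 2: export the actual dma-buf fds -- this is what fails. */
    return doExport(dpy, img, fds, strides, offsets);
}
```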

I've not tested the library with mplayer, so it's unlikely to work there at all. It seems to be calling vaPutImage and vaPutSurface, which aren't implemented in the library at the moment.

Iliolou commented 2 years ago

Based on the comment in #14, mpv works great on my Optimus laptop using --hwdec=vaapi-copy, whereas --hwdec=vaapi fails. I had to set up PRIME render offload first. I can see 4% GPU utilization and 4% Video Engine utilization in NVIDIA Settings while playing a low-res 1280x692 movie. Thank you!


```
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export LIBVA_DRIVER_NAME=nvidia
mpv --hwdec=vaapi-copy test.mp4
```

[vo/gpu/opengl] Initializing GPU context 'x11egl'
[vo/gpu/opengl] EGL_VERSION=1.5
[vo/gpu/opengl] EGL_VENDOR=NVIDIA
[vo/gpu/opengl] EGL_CLIENT_APIS=OpenGL_ES OpenGL
[vo/gpu/opengl] Trying to create Desktop OpenGL context.
[vo/gpu/opengl] Choosing visual EGL config 0x28, visual ID 0x20
[vo/gpu/opengl] GL_VERSION='4.4.0 NVIDIA 470.86'
[vo/gpu/opengl] Detected desktop OpenGL 4.4.
[vo/gpu/opengl] GL_VENDOR='NVIDIA Corporation'
[vo/gpu/opengl] GL_RENDERER='NVIDIA GeForce GTX 950M/PCIe/SSE2'
[vo/gpu/opengl] GL_SHADING_LANGUAGE_VERSION='4.40 NVIDIA via Cg compiler'
[vo/gpu] Using FBO format rgba16f
[vd] Codec list:
[vd]     h264 - H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
[vd]     h264_v4l2m2m (h264) - V4L2 mem2mem H.264 decoder wrapper
[vd]     libopenh264 (h264) - OpenH264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
[vd]     h264_cuvid (h264) - Nvidia CUVID H264 decoder
[vd] Opening decoder h264
[vd] Looking at hwdec h264-vaapi-copy...
[vaapi] Initialized VAAPI: version 1.12
[vd] Trying hardware decoding via h264-vaapi-copy.
[vd] Selected codec: h264 (H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10)
[vd] Pixel formats supported by decoder: vdpau cuda vaapi_vld yuv420p
[vd] Codec profile: High (0x64)
[vd] Requesting pixfmt 'vaapi_vld' from decoder.
**[vd] Using hardware decoding (vaapi-copy).**
[vd] Decoder format: 1280x692 nv12 auto/auto/auto/auto/auto CL=mpeg2/4/h264
[vf] [in] 1280x692 [1803737147:1876720717] nv12 bt.709/bt.709/bt.1886/limited/display SP=1.000000 CL=mpeg2/4/h264
[vf] [out[cplayer] VO: [gpu] 1280x692 => 1280x719 nv12
[cplayer] VO: Description: Shader-based GPU Renderer
[vo/gpu] reconfig to 1280x692 [1803737147:1876720717] nv12 bt.709/bt.709/bt.1886/limited/display SP=1.000000 CL=mpeg2/4/h264
[statusline] AV: 00:00:03 / 01:36:44 (0%) A-V:  0.000
cubanismo commented 2 years ago

FWIW, I don't have any great theories here after reading through #33. If this really is some Optimus/PRIME/render-offload interaction issue, I don't know why it would fail here; as @elFarto says, the exporter has no idea up front who will be importing the buffer, and hence should happily export it. If there were cross-GPU sharing issues, I'd expect them to occur during import on the non-NV GPU instead.

elFarto commented 2 years ago

I've only tested the Optimus/PRIME setup with my GeForce 760 on the 470 drivers, and that fails when the Intel driver tries to import the buffer. I assume that's some limitation of that driver series, as it works fine when it's the NVIDIA driver doing the importing.

I do need to pull my 1060 out and put it in that machine to test it with the newer drivers, but that's a pain :smile:

cubanismo commented 2 years ago

The import-side failure won't be fixed by newer drivers either. The memory-constraints work I've been involved with for years now would be needed for that.

The problem is that the EGLImage memory is most likely going to be in GPU-local memory (vidmem), which the Intel driver won't be able to map into its GPU. Either there needs to be some way to access that GPU-local memory directly from third-party devices (technically possible, but not with current driver code, and often not as optimal as it might seem anyway), or the two sides need to negotiate up front for a buffer in some more mutually agreeable location (system memory, generally). The latter is where memory-constraint APIs would come in.

Upstream drivers built on top of TTM solve this automatically when possible by dynamically migrating the memory to that shared location on import (roughly speaking), but our driver architecture doesn't allow this, and as mentioned, that's often not actually the most optimal configuration. When dma-buf is used internally by higher-level APIs (like PRIME render offload through Vulkan/GLX/EGL+X11), we can detect this case and internally insert a vidmem->sysmem blit or vice versa, placing the dma-buf memory in sysmem; but with direct buffer access, there's no point in the API to accomplish this cleanly.

This was another one of those things where the EGLStreams-based sharing model made things easier on drivers by providing a slightly higher-level abstraction. Eventually we'll have the right tools to expose the same level of functionality while still providing the lower-level access that dma-buf-based sharing provides.
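For illustration, the import step being described is roughly the EGL_EXT_image_dma_buf_import path below, shown for a single plane with hypothetical parameters (a real NV12 import adds PLANE1 attributes). This is the step that can't succeed when the fd refers to vidmem the importing GPU cannot map.

```
/* Hedged sketch of the iGPU-side dma-buf import via
 * EGL_EXT_image_dma_buf_import.  Single plane only; all parameters
 * are illustrative. */
#include <EGL/egl.h>
#include <EGL/eglext.h>

static EGLImageKHR import_dmabuf_plane(EGLDisplay dpy, int fd,
                                       EGLint width, EGLint height,
                                       EGLint fourcc, EGLint offset,
                                       EGLint pitch)
{
    PFNEGLCREATEIMAGEKHRPROC createImage =
        (PFNEGLCREATEIMAGEKHRPROC)eglGetProcAddress("eglCreateImageKHR");

    const EGLint attrs[] = {
        EGL_WIDTH,                     width,
        EGL_HEIGHT,                    height,
        EGL_LINUX_DRM_FOURCC_EXT,      fourcc,
        EGL_DMA_BUF_PLANE0_FD_EXT,     fd,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, offset,
        EGL_DMA_BUF_PLANE0_PITCH_EXT,  pitch,
        EGL_NONE
    };
    /* The dma-buf target takes no client buffer and no context. */
    return createImage(dpy, EGL_NO_CONTEXT, EGL_LINUX_DMA_BUF_EXT,
                       NULL, attrs);
}
```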

elFarto commented 2 years ago

That's good to know. It sounds like resizable PCIe BAR would help with that, since it would technically make all of the GPU's memory mappable, or is the limitation on the Intel side?

Is there anything I can do in CUDA to get the buffer/CUarray placed in the correct location before exporting it?

cubanismo commented 2 years ago

Is there anything I can do in CUDA to get the buffer/CUarray placed in the correct location before exporting it?

I'm largely unfamiliar with the CUDA API, but I imagine it would be hard to do with a surface that came from NVDEC. I'd be digging through the same CUDA manuals as you to answer that.

philipl commented 2 years ago

I see https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__EGL.html#group__CUDA__EGL_1g7be3b064ea600a7bac4906e5d61ba4b7

where you can connect the consumer and specify which memory the consumer wants the EGL frame to be in. If you were going to consume on the Intel iGPU side, then you'd want sysmem here, but what's the actual use case? Even on an Optimus laptop, if you are using this driver in the first place, you'd want to be using GL/Vulkan on the dGPU as well; otherwise, why bother? Use VA-API on the iGPU to go with Vulkan/GL there too.
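Assuming the linked entry point is cuEGLStreamConsumerConnectWithFlags, a minimal sketch of requesting system-memory placement would look something like this (error handling omitted; stream setup assumed to happen elsewhere):

```
/* Hedged sketch: connect a CUDA consumer to an EGLStream and request
 * that frames land in system memory, which a non-NVIDIA importer could
 * plausibly reach.  'stream' is assumed to be a valid EGLStreamKHR. */
#include <cuda.h>
#include <cudaEGL.h>

static CUresult connect_consumer_in_sysmem(EGLStreamKHR stream,
                                           CUeglStreamConnection *conn)
{
    /* CU_EGL_RESOURCE_LOCATION_SYSMEM asks for sysmem-backed frames;
     * CU_EGL_RESOURCE_LOCATION_VIDMEM would keep them in GPU memory. */
    return cuEGLStreamConsumerConnectWithFlags(
        conn, stream, CU_EGL_RESOURCE_LOCATION_SYSMEM);
}
```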

It's theoretically interesting, but I don't think it's solving a real problem. In fact, in the originally reported problem, is a mismatched iGPU/dGPU combination being used? It should be all one way or the other.

cubanismo commented 2 years ago

Yeah, I had assumed someone was using an NV dGPU to decode, then sending the output to an iGPU for presentation via GL or some other mechanism. If you're using GL as a simple blitter, or sending the dma-buf straight to some display hardware for presentation (directly, or after forwarding to Wayland/X11 via some socket protocol or other framework/API), that probably makes sense. However, if you're doing some non-trivial processing in GL or Vulkan, it would generally make sense to be doing that on the dGPU as well.

Separately, if the user really is attempting to do all this on the dGPU, I'm not sure why importing would fail just because there's also an iGPU in the system.

elFarto commented 2 years ago

@Iliolou Can you retest using the latest version, v0.0.3?

Iliolou commented 2 years ago

The latest version, 0.0.4, works fine, with no need for __NV_PRIME_RENDER_OFFLOAD anymore on Optimus. To clarify: X11 runs on Intel, not on NVIDIA. mpv (even an old 0.32 version) works as well as it does with nvdec.

```
export LIBVA_DRIVER_NAME=nvidia
mpv --hwdec=vaapi-copy test.mp4   # Success
mpv --hwdec=vaapi test.mp4        # Failed to find VA X11 surface.
```

libva info: VA-API version 1.12.0
libva info: User environment variable requested driver 'nvidia'
libva info: Trying to open /usr/lib64/va/drivers/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.12 (libva 2.12.0)
vainfo: Driver version: VA-API NVDEC driver
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      <unknown profile>               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD

It is a rather old GeForce GTX 950M. No HEVC or AV1.

philipl commented 2 years ago

Interop will only work if the application is using the NVIDIA GPU for X11. That generally means using:

```
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia
```

as per the NVIDIA docs.

But if you're letting Intel drive X11, you'd be better off letting Intel handle the vaapi video decode as well.
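Putting that together, the two consistent configurations would look roughly like this (test.mp4 is a placeholder; iHD/i965 are Intel's VA-API drivers for recent and older hardware respectively):

```
# Everything on the NVIDIA dGPU via PRIME render offload:
__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia \
LIBVA_DRIVER_NAME=nvidia mpv --hwdec=vaapi test.mp4

# Or everything on the Intel iGPU, using Intel's VA-API driver:
LIBVA_DRIVER_NAME=iHD mpv --hwdec=vaapi test.mp4
```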

Saancreed commented 2 years ago

--hwdec=vaapi-copy works on my own Optimus laptop too, also without PRIME, but --hwdec=vaapi with PRIME gives me a green screen, and without PRIME it results in a kernel oops that makes X/the display unusable until reboot :sweat_smile:

dmesg
```
BUG: kernel NULL pointer dereference, address: 000000000000000c
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 3231ba067 P4D 3231ba067 PUD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 6807 Comm: mpv/vo Kdump: loaded Tainted: P OE 5.16.10-zen1-1-zen #1 45ce4b459ccb24e2f8f3aee69f17c21decbb4b0c
Hardware name: ASUSTeK COMPUTER INC. TUF Gaming FX705DU_FX705DU/FX705DU, BIOS FX705DU.316 01/28/2021
RIP: 0010:dma_map_sgtable+0x13/0x90
Code: ff e9 2c ff ff ff e8 6c 20 c1 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 87 40 02 00 00 41 89 d1 53 48 89 f3 <8b> 56 0c 48 8b 36 48 85 c0 48 0f 44 05 14 6b 52 02 41 83 f9 02 77
RSP: 0018:ffffaf18cce1fbd8 EFLAGS: 00010207
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ef0c15c60d0
RBP: ffff9ef2d5005000 R08: ffff9ef2d5005010 R09: 0000000000000000
R10: 0000000000000000 R11: ffff9ef28ec9b808 R12: 0000000000000000
R13: 0000000000000000 R14: ffff9ef2d5005010 R15: ffff9ef0c38805b8
FS: 00007f463e38e640(0000) GS:ffff9ef7bf3c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 000000031ec02000 CR4: 00000000003506e0
Call Trace:
 drm_gem_map_dma_buf+0x53/0xa0
 dma_buf_dynamic_attach+0x154/0x280
 amdgpu_gem_prime_import+0xd2/0x1e0 [amdgpu 9832e49f94618e69792064c36eec1cfc62591b25]
 drm_gem_prime_fd_to_handle+0xbd/0x1d0
 ? drm_prime_destroy_file_private+0x20/0x20
 drm_ioctl_kernel+0xb8/0x140
 ? pty_write+0x86/0x90
 drm_ioctl+0x22a/0x3d0
 ? drm_prime_destroy_file_private+0x20/0x20
 amdgpu_drm_ioctl+0x49/0x80 [amdgpu 9832e49f94618e69792064c36eec1cfc62591b25]
 __x64_sys_ioctl+0x82/0xb0
 do_syscall_64+0x5c/0x80
 ? syscall_exit_to_user_mode+0x23/0x40
 ? do_syscall_64+0x69/0x80
 ? syscall_exit_to_user_mode+0x23/0x40
 ? do_syscall_64+0x69/0x80
 ? syscall_exit_to_user_mode+0x23/0x40
 ? do_syscall_64+0x69/0x80
 ? syscall_exit_to_user_mode+0x23/0x40
 ? do_syscall_64+0x69/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f465a113e6f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
RSP: 002b:00007f463e38c870 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f463e38c90c RCX: 00007f465a113e6f
RDX: 00007f463e38c90c RSI: 00000000c00c642e RDI: 0000000000000013
RBP: 00000000c00c642e R08: 0000000000000002 R09: 0000000000000000
R10: 00007f4634ac65f0 R11: 0000000000000246 R12: 0000000000000002
R13: 0000000000000013 R14: 00007f46141b7c50 R15: 0000000000000000
Modules linked in: tun snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm bnep ccm algif_aead cbc des_generic libdes ecb algif_skcipher cmac md4 algif_hash af_alg nls_iso8859_1 zfs(POE) vfat fat zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) pkcs8_key_parser ntfs3(OE) intel_rapl_msr intel_rapl_common edac_mce_amd joydev ucsi_ccg rtw88_8822be kvm_amd snd_hda_codec_realtek amdgpu typec_ucsi rtw88_8822b snd_hda_codec_generic btusb ledtrig_audio asus_nb_wmi typec snd_hda_codec_hdmi btrtl kvm rtw88_pci roles asus_wmi snd_hda_intel btbcm hid_multitouch sparse_keymap uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 rtw88_core videobuf2_common platform_profile wmi_bmof crct10dif_pclmul crc32_pclmul mac80211 ghash_clmulni_intel snd_intel_dspcfg aesni_intel videodev btintel snd_intel_sdw_acpi crypto_simd libarc4 mousedev cryptd r8169 bluetooth mc snd_hda_codec rapl gpu_sched cfg80211 drm_ttm_helper ecdh_generic snd_hda_core ttm realtek pcspkr mdio_devres ccp rfkill snd_hwdep libphy snd_pcm snd_timer video sp5100_tco i2c_hid_acpi snd i2c_hid mac_hid soundcore i2c_nvidia_gpu i2c_piix4 k10temp nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) pinctrl_amd asus_wireless wmi acpi_cpufreq tpm_crb tpm_tis tpm_tis_core tpm rng_core nvidia(POE) ipmi_devintf ipmi_msghandler crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_logitech_hidpp hid_logitech_dj usbhid serio_raw atkbd libps2 crc32c_intel xhci_pci xhci_pci_renesas i8042 serio vfio_pci vfio_pci_core irqbypass vfio_virqfd vfio_iommu_type1 vfio
CR2: 000000000000000c
---[ end trace 2f8a7ddfff6c0abf ]---
RIP: 0010:dma_map_sgtable+0x13/0x90
Code: ff e9 2c ff ff ff e8 6c 20 c1 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 87 40 02 00 00 41 89 d1 53 48 89 f3 <8b> 56 0c 48 8b 36 48 85 c0 48 0f 44 05 14 6b 52 02 41 83 f9 02 77
RSP: 0018:ffffaf18cce1fbd8 EFLAGS: 00010207
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ef0c15c60d0
RBP: ffff9ef2d5005000 R08: ffff9ef2d5005010 R09: 0000000000000000
R10: 0000000000000000 R11: ffff9ef28ec9b808 R12: 0000000000000000
R13: 0000000000000000 R14: ffff9ef2d5005010 R15: ffff9ef0c38805b8
FS: 00007f463e38e640(0000) GS:ffff9ef7bf3c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 000000031ec02000 CR4: 00000000003506e0
```

That's on driver version 510.47.03 with a GTX 1660 Ti Mobile, while X11 is driven by the integrated AMD Radeon Vega 10.

elFarto commented 1 year ago

I spent several hours looking into this the other day, and I've come to the conclusion that Firefox can't support running on the non-default GPU (in X11, at least). The gfxtest process that runs has slightly different logic for creating EGL contexts, which prevents it from creating a context on the NVIDIA GPU. The approach we've been trying to use just ends up with no EGL drivers to choose from.

I did try some slight modifications to Firefox to make it work, but there appear to be multiple places where changes would be needed. Ideally we'd skip all that and just make video decoding work on NVIDIA, with the rest of the rendering done by the Intel chip, but that's a far harder task.

iamkarlson commented 1 year ago

Could you please elaborate on what "non-default" means, and how I can change it? I'm fine with running NVIDIA all the time instead of the Intel GPU.

elFarto commented 1 year ago

The non-default GPU would be any GPU that the X server isn't running on. You would need to configure Xorg to use the NVIDIA GPU only.

I'm not entirely sure on the specifics of setting it up, but the Arch Wiki has some instructions on it.
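As a sketch of the usual approach from those instructions, an xorg.conf.d snippet marks the NVIDIA card as the primary GPU. The filename and exact options below are illustrative and may vary by distribution:

```
# /etc/X11/xorg.conf.d/10-nvidia-primary.conf (illustrative filename)
Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    Option "PrimaryGPU" "yes"
EndSection
```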