Closed knutj closed 5 months ago
Resolved
"resolved"
I am hitting this same issue with NVIDIA-Linux-x86_64-550.90.05-vgpu-kvm.run. How was this resolved?
nvidia-driver-runtime-n2f6s:/ # nvidia-smi
No devices were found
nvidia-driver-runtime-n2f6s:/ # lsmod | grep nvidia
nvidia_vgpu_vfio 86016 0
nvidia 8699904 1 nvidia_vgpu_vfio
mdev 28672 1 nvidia_vgpu_vfio
vfio 45056 3 nvidia_vgpu_vfio,vfio_iommu_type1,mdev
drm 634880 7 drm_kms_helper,drm_vram_helper,ast,nvidia,drm_ttm_helper,ttm
kvm 1056768 2 kvm_amd,nvidia_vgpu_vfio
irqbypass 16384 2 nvidia_vgpu_vfio,kvm
nvidia-driver-runtime-n2f6s:/ # lspci | grep NVIDIA
41:00.0 VGA compatible controller: NVIDIA Corporation Device 26b2 (rev a1)
41:00.1 Audio device: NVIDIA Corporation Device 22ba (rev a1)
nvidia-driver-runtime-n2f6s:/ # dmesg | grep NVIDIA
[ 153.856013] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 550.90.05 Release Build (dvs-builder@U16-I1-N08-05-1) Mon May 27 14:37:46 UTC 2024
[ 155.996611] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 550.90.05 Release Build (dvs-builder@U16-I1-N08-05-1) Mon May 27 14:37:46 UTC 2024
nvidia-driver-runtime-n2f6s:/ # dmesg | grep nvidia
[ 153.784296] nvidia: loading out-of-tree module taints kernel.
[ 153.787846] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 153.808814] nvidia: externally supported module, setting X kernel taint flag.
[ 153.810802] nvidia-nvlink: Nvlink Core is being initialized, major device number 511
[ 153.812787] nvidia 0000:41:00.0: enabling device (0000 -> 0003)
[ 153.812983] nvidia 0000:41:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 153.862748] nvidia_vgpu_vfio: externally supported module, setting X kernel taint flag.
[ 153.918226] nvidia-nvlink: Unregistered Nvlink Core, major device number 511
[ 155.942593] nvidia: externally supported module, setting X kernel taint flag.
[ 155.945345] nvidia-nvlink: Nvlink Core is being initialized, major device number 511
[ 155.947949] nvidia 0000:41:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[ 156.001285] nvidia_vgpu_vfio: externally supported module, setting X kernel taint flag.
[ 156.146216] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
[ 156.146990] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
[ 156.151932] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
[ 156.152440] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
[ 241.904348] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
I made sure to install the right firmware. On my system I have installed 560.28.03 in /lib/firmware/nvidia/560.28.03.
Except this version of the driver is looking for 550.90.05:
[ 156.146216] nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
You have a bad driver installation. You might have multiple versions, or maybe the old one didn't uninstall properly, or whatever. You need to clean up your system.
Good luck.
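To make the mismatch described above easier to spot, here is a rough Python sketch that pulls the requested firmware version out of the dmesg text and checks whether that file exists on disk. It is only an illustration, not an official tool: the function names (`requested_firmware`, `missing_firmware`) are made up for this example, the default root `/lib/firmware/nvidia` is taken from the path quoted in this thread, and it assumes the blob is stored uncompressed (some distributions ship compressed firmware, which this sketch does not look for).

```python
import re
from pathlib import Path

# Matches kernel lines such as:
#   nvidia 0000:41:00.0: Direct firmware load for nvidia/550.90.05/gsp_ga10x.bin failed with error -2
FIRMWARE_FAIL = re.compile(
    r"Direct firmware load for nvidia/(?P<version>[\d.]+)/(?P<blob>\S+) "
    r"failed with error (?P<code>-?\d+)"
)


def requested_firmware(dmesg_text):
    """Return the set of (version, blob) pairs the driver failed to load."""
    return {(m["version"], m["blob"]) for m in FIRMWARE_FAIL.finditer(dmesg_text)}


def missing_firmware(dmesg_text, firmware_root="/lib/firmware/nvidia"):
    """List requested (version, blob) pairs that are absent under firmware_root.

    Assumption: blobs are stored uncompressed as <root>/<version>/<blob>.
    """
    root = Path(firmware_root)
    return sorted(
        (version, blob)
        for version, blob in requested_firmware(dmesg_text)
        if not (root / version / blob).is_file()
    )
```

Feeding it the output of `dmesg` (which may require root) should list `("550.90.05", "gsp_ga10x.bin")` on a system where only the 560.28.03 firmware directory is present, which is the situation reported above.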
NVIDIA Open GPU Kernel Modules Version
550.76
Please confirm this issue does not happen with the proprietary driver (of the same version). This issue tracker is only for bugs specific to the open kernel driver.
Operating System and Version
Description: Fedora release 40 (Forty)
Kernel Release
Linux knut 6.8.9-300.fc40.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 2 18:59:06 UTC 2024 x86_64 GNU/Linux
Please confirm you are running a stable release kernel (e.g. not a -rc). We do not accept bug reports for unreleased kernels.
Hardware: GPU
GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-7748274a-4638-0708-093c-ed052c2b4537)
Describe the bug
[ 11.565746] nvidia: loading out-of-tree module taints kernel.
[ 11.659723] nvidia-nvlink: Nvlink Core is being initialized, major device number 509
[ 11.660518] nvidia 0000:01:00.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=none:owns=none
[ 12.030254] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 550.76 Release Build (akmods@knut) Sat 11 May 06:40:43 CEST 2024
[ 12.035486] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 12.036183] nvidia 0000:01:00.0: Direct firmware load for nvidia/550.76/gsp_ga10x.bin failed with error -2
[ 12.036865] [drm:nv_drm_load [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[ 12.036926] [drm:nv_drm_register_drm_device [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to register device
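For readers unsure what "failed with error -2" means in the log: kernel functions return negated errno values, so -2 corresponds to ENOENT, i.e. the firmware file was not found. A small sketch using Python's standard `errno` module shows the lookup; the helper name `describe_kernel_error` is invented for this example.

```python
import errno
import os


def describe_kernel_error(code):
    """Translate a negative kernel return code (e.g. -2) into its errno name and message."""
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names such as "ENOENT".
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


print(describe_kernel_error(-2))
```

On Linux this reports ENOENT for -2, which fits a driver asking for nvidia/550.76/gsp_ga10x.bin when that file is not installed.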
To Reproduce
Switch to open-gpu-model, then reboot.
Bug Incidence
Always
nvidia-bug-report.log.gz
More Info
No response