DualCoder / vgpu_unlock

Unlock vGPU functionality for consumer grade GPUs.
MIT License

dkms remove -m nvidia failed: No original module was found for this module on this kernel #69

Open Arensi9801 opened 3 years ago

Arensi9801 commented 3 years ago

When I execute the following command:

dkms remove -m nvidia -v 450.89 --all

it prints:

nvidia.ko:

nvidia-vgpu-vfio.ko:

depmod...

DKMS: uninstall completed.

After I install it again, I still can't use vGPU on my 2080s. How can I fix it?

Arensi9801 commented 3 years ago

When I install it, it prints:

DKMS: build completed.

nvidia.ko: Running module version sanity check.

nvidia-vgpu-vfio.ko: Running module version sanity check.

depmod...

DKMS: install completed.

DualCoder commented 3 years ago

Well, it seems like DKMS is able to build the module: DKMS: build completed.

I don't know the exact paths for PVE, but you could do a sanity check like:

sha256sum /var/lib/dkms/nvidia/450.89/5.4.34-1-pve/x86_64/module/*
sha256sum /lib/modules/5.4.34-1-pve/updates/dkms/*

These should produce the same checksums; otherwise there was some error during installation. If they do match, then the kernel module was installed and the error is somewhere else.
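For anyone who would rather not compare the hashes by eye, here is a minimal sketch of the same check as a script. The function name compare_module_dirs and the OK/MISMATCH/MISSING labels are made up for illustration; it only assumes that sha256sum is available and that the module files have the same names in both directories.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare each file in the DKMS build directory with the
# file of the same name in the installed-modules directory via SHA-256.
compare_module_dirs() {
    local build_dir=$1 install_dir=$2 status=0 f name a b
    for f in "$build_dir"/*; do
        [ -f "$f" ] || continue            # skip anything that is not a regular file
        name=$(basename "$f")
        if [ ! -f "$install_dir/$name" ]; then
            echo "MISSING  $name"          # built, but not present in the install dir
            status=1
            continue
        fi
        a=$(sha256sum "$f" | cut -d' ' -f1)
        b=$(sha256sum "$install_dir/$name" | cut -d' ' -f1)
        if [ "$a" = "$b" ]; then
            echo "OK       $name"
        else
            echo "MISMATCH $name"
            status=1
        fi
    done
    return "$status"                       # 0 only if every file matched
}
```

With the paths from the comment above, the call would be: compare_module_dirs /var/lib/dkms/nvidia/450.89/5.4.34-1-pve/x86_64/module /lib/modules/5.4.34-1-pve/updates/dkms. A non-zero exit status means at least one module differs or is missing.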

Arensi9801 commented 3 years ago

Oh, thank you! After I tried the method you provided, the results showed that the modules had been installed correctly. vGPU worked normally after I restarted nvidia-vgpu-mgr.service.

Arensi9801 commented 3 years ago

Curiously, I can't do Q-type slicing right now, but I can do A-type slicing. This is my nvidia-vgpu-mgr.service log:

root@kr:~# journalctl -u nvidia-vgpu-mgr
-- Logs begin at Fri 2021-08-20 18:37:36 CST, end at Mon 2021-08-23 10:00:00 CST. --
Aug 20 18:37:40 kr systemd[1]: Starting NVIDIA vGPU Manager Daemon...
Aug 20 18:37:40 kr systemd[1]: Started NVIDIA vGPU Manager Daemon.
Aug 20 18:37:40 kr bash[1394]: vgpu_unlock loaded.
Aug 20 18:37:40 kr nvidia-vgpu-mgr[1394]: vgpu_unlock loaded.
Aug 20 18:37:40 kr nvidia-vgpu-mgr[1428]: vgpu_unlock loaded.
Aug 20 18:37:41 kr nvidia-vgpu-mgr[1428]: notice: vmiop_env_log: nvidia-vgpu-mgr daemon started
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2279]: vgpu_unlock loaded.
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: vgpu_unlock loaded.
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_env_log: vmiop-env: guest_max_gpfn:0x0
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_env_log: (0x0): Received start call from nvidia-vgpu-vfio module: mdev uuid 00000000-0000-0000-0000-000000000103 GPU PCI id 00:4b:00.0 config params vgpu_type_id=440
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_env_log: (0x0): pluginconfig: vgpu_type_id=440
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_env_log: Successfully updated env symbols!
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: op_type: 0x20801322 failed.
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: op_type: 0x2080014b failed.
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): gpu-pci-id : 0x4b00
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): vgpu_type : NVS
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Framebuffer: 0xec000000
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Virtual Device Id: 0x1e30:0x143c
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): FRL Value: 60 FPS
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: ######## vGPU Manager Information: ########
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: Driver Version: 450.89
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: op_type: 0x2080012f failed.
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Init frame copy engine: syncing...
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): vGPU migration disabled
Aug 20 18:39:07 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: display_init inst: 0 successful
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: Driver Version: 462.96
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): Incompatible Guest/Host drivers: Guest VGX version is newer than the maximum version supported by the Host. Disabling vGPU.
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): VGPU message 1 failed, result code: 0x6a
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): 0x1a, 0x24, 0x100, 0x100,
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): 0x100, 0x0, '462.96', 'r462_84-4', 'DVSReal r462_84 462.96 DVS-Applications'
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): VGPU message 47 failed, guest VGX version not initialized...
Aug 20 18:39:35 kr nvidia-vgpu-mgr[2291]: error: vmiop_log: (0x0): VGPU message 47 failed, result code: 0x56
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2835]: vgpu_unlock loaded.
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: vgpu_unlock loaded.
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_env_log: vmiop-env: guest_max_gpfn:0x0
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_env_log: (0x0): Received start call from nvidia-vgpu-vfio module: mdev uuid 00000000-0000-0000-0000-000000000107 GPU PCI id 00:4b:00.0 config params vgpu_type_id=440
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_env_log: (0x0): pluginconfig: vgpu_type_id=440
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_env_log: Successfully updated env symbols!
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: op_type: 0x20801322 failed.
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: op_type: 0x2080014b failed.
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): gpu-pci-id : 0x4b00
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): vgpu_type : NVS
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Framebuffer: 0xec000000
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Virtual Device Id: 0x1e30:0x143c
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): FRL Value: 60 FPS
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: ######## vGPU Manager Information: ########
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: Driver Version: 450.89
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: op_type: 0x2080012f failed.
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Init frame copy engine: syncing...
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): vGPU migration disabled
Aug 20 18:41:57 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: display_init inst: 0 successful
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: Driver Version: 462.96
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): Incompatible Guest/Host drivers: Guest VGX version is newer than the maximum version supported by the Host. Disabling vGPU.
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 1 failed, result code: 0x6a
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): 0x1a, 0x24, 0x100, 0x100,
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): 0x100, 0x0, '462.96', 'r462_84-4', 'DVSReal r462_84 462.96 DVS-Applications'
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 47 failed, guest VGX version not initialized...
Aug 20 18:42:22 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 47 failed, result code: 0x56
Aug 20 18:43:56 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 18:43:56 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: Driver Version: 442.92
Aug 20 18:43:56 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: vGPU version: 0x50001
Aug 20 18:43:56 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Current max guest pfn = 0xc58a53!
Aug 20 18:43:57 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfba34!
Aug 20 18:44:00 kr nvidia-vgpu-mgr[2291]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfffdc!
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: Driver Version: 461.33
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): Incompatible Guest/Host drivers: Guest VGX version is newer than the maximum version supported by the Host. Disabling vGPU.
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 1 failed, result code: 0x6a
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): 0x1a, 0x18, 0x100, 0x100,
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): 0x100, 0x0, '461.33', 'r460_94-9', 'DVSReal r460_94 461.33 DVS-Applications'
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 47 failed, guest VGX version not initialized...
Aug 20 18:45:17 kr nvidia-vgpu-mgr[2847]: error: vmiop_log: (0x0): VGPU message 47 failed, result code: 0x56
Aug 20 18:50:31 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 18:50:31 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: Driver Version: 442.92
Aug 20 18:50:31 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: vGPU version: 0x50001
Aug 20 18:50:31 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Current max guest pfn = 0x1de35c!
Aug 20 18:50:31 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfb81b!
Aug 20 18:50:35 kr nvidia-vgpu-mgr[2847]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfffdc!
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33626]: vgpu_unlock loaded.
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: vgpu_unlock loaded.
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_env_log: vmiop-env: guest_max_gpfn:0x0
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_env_log: (0x0): Received start call from nvidia-vgpu-vfio module: mdev uuid 00000000-0000-0000-0000-000000000107 GPU PCI id 00:4b:00.0 config params vgpu_type_id=440
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_env_log: (0x0): pluginconfig: vgpu_type_id=440
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_env_log: Successfully updated env symbols!
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: op_type: 0x20801322 failed.
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: op_type: 0x2080014b failed.
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): gpu-pci-id : 0x4b00
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): vgpu_type : NVS
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Framebuffer: 0xec000000
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Virtual Device Id: 0x1e30:0x143c
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): FRL Value: 60 FPS
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: ######## vGPU Manager Information: ########
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: Driver Version: 450.89
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: op_type: 0x2080012f failed.
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Init frame copy engine: syncing...
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): vGPU migration disabled
Aug 20 20:57:16 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: display_init inst: 0 successful
Aug 20 20:57:26 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: ######## Guest NVIDIA Driver Information: ########
Aug 20 20:57:26 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: Driver Version: 442.92
Aug 20 20:57:26 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: vGPU version: 0x50001
Aug 20 20:57:26 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Current max guest pfn = 0xc9f257!
Aug 20 20:57:26 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfba2d!
Aug 20 20:57:29 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfbb30!
Aug 20 20:57:29 kr nvidia-vgpu-mgr[33638]: notice: vmiop_log: (0x0): Current max guest pfn = 0xcfffdc!
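A log this long is easier to read once it is reduced to the guest driver versions and what happened to each. The following is a rough sketch, not an official tool: the function name summarize_guest_drivers and the labels "rejected" and "negotiated" are invented here. It assumes one journal entry per line (as journalctl prints them) and treats a "vGPU version:" line following a guest driver version as a sign that negotiation went through, which is an interpretation of the log rather than documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read nvidia-vgpu-mgr journal lines on stdin and print
# each guest driver version together with the outcome seen in the log.
summarize_guest_drivers() {
    awk '
        # $5 is the "nvidia-vgpu-mgr[PID]:" field; state is tracked per process.
        /Guest NVIDIA Driver Information/ { pending[$5] = 1; next }
        /Driver Version:/ && pending[$5]  { version[$5] = $NF; pending[$5] = 0; next }
        /Incompatible Guest\/Host drivers/ && ($5 in version) {
            print version[$5], "rejected"; delete version[$5]; next
        }
        /vGPU version:/ && ($5 in version) {
            print version[$5], "negotiated"; delete version[$5]
        }
    '
}
```

It could be fed with something like: journalctl -u nvidia-vgpu-mgr --no-pager | summarize_guest_drivers. Host driver lines (under "vGPU Manager Information") are skipped because only a guest header arms the version match.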

Arensi9801 commented 3 years ago

This is my nvidia-vgpud.service log:

root@kr:~# journalctl -u nvidia-vgpud.service
-- Logs begin at Fri 2021-08-20 18:37:36 CST, end at Mon 2021-08-23 09:48:18 CST. --
Aug 20 18:37:40 kr systemd[1]: Starting NVIDIA vGPU Daemon...
Aug 20 18:37:40 kr systemd[1]: Started NVIDIA vGPU Daemon.
Aug 20 18:37:40 kr bash[1393]: vgpu_unlock loaded.
Aug 20 18:37:40 kr nvidia-vgpud[1393]: vgpu_unlock loaded.
Aug 20 18:37:40 kr nvidia-vgpud[1430]: vgpu_unlock loaded.
Aug 20 18:37:40 kr nvidia-vgpud[1430]: Verbose syslog connection opened
Aug 20 18:37:40 kr nvidia-vgpud[1430]: Started (1430)
Aug 20 18:37:40 kr nvidia-vgpud[1430]: Global settings:
Aug 20 18:37:40 kr nvidia-vgpud[1430]: Size: 16 Version 1
Aug 20 18:37:40 kr nvidia-vgpud[1430]: Homogeneous vGPUs: 1
Aug 20 18:37:40 kr nvidia-vgpud[1430]: vGPU types: 401
Aug 20 18:37:40 kr nvidia-vgpud[1430]:
Aug 20 18:37:41 kr nvidia-vgpud[1430]: pciId of gpu [0]: 0:4b:0:0
Aug 20 18:37:41 kr nvidia-vgpud[1430]:
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Physical GPU:
Aug 20 18:37:41 kr nvidia-vgpud[1430]: PciID: 0x0000 / 0x004b / 0x0000 / 0x0000
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Size: 52 Version 1
Aug 20 18:37:41 kr nvidia-vgpud[1430]: DevID: 0x10de / 0x1e30 / 0x10de / 0x12ba
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Supported vGPUs count: 23
Aug 20 18:37:41 kr nvidia-vgpud[1430]:
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Supported VGPU 0x100: max 24
Aug 20 18:37:41 kr nvidia-vgpud[1430]: VGPU Type 0x100: GRID RTX6000-1Q Class: Quadro
Aug 20 18:37:41 kr nvidia-vgpud[1430]: DevId: 0x10de / 0x1e30 / 0x10de / 0x1325
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Framebuffer: 0x38000000
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Mappable video size: 0x400000
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Framebuffer reservation: 0x8000000
Aug 20 18:37:41 kr nvidia-vgpud[1430]: FRL configuration: 0x3c
Aug 20 18:37:41 kr nvidia-vgpud[1430]: CUDA enabled: 0x1
Aug 20 18:37:41 kr nvidia-vgpud[1430]: ECC supported: 0x1
Aug 20 18:37:41 kr nvidia-vgpud[1430]: Multi vGPU supported: 0x0

Arensi9801 commented 3 years ago

Everything is fine with A-type sliced graphics cards, but I can't make Q-type slices. Although it has little impact on use at the moment, I want to know why.

DualCoder commented 3 years ago

but I can't make Q-type slices

I don't understand what you mean; the only thing I see in the logs is the A-type working after you installed a compatible guest driver. Please provide more details, for example:

sonnh-uit commented 1 year ago

Hello everyone,

When I reinstalled the NVIDIA driver, version 510.73.06, with DKMS, I got this error:

nvidia.ko.xz:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.10.0-1160.71.1.el7.x86_64/extra/

nvidia-vgpu-vfio.ko.xz:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.10.0-1160.71.1.el7.x86_64/extra/
Adding any weak-modules
depmod: ERROR: fstatat(4, nvidia-drm.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-modeset.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-uvm.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-drm.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-modeset.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-uvm.ko.xz): No such file or directory
depmod...

I looked in the folder /lib/modules/3.10.0-1160.71.1.el7.x86_64/extra/, but I do not see nvidia-drm.ko.xz or any of the other files named in the errors above. The listing looks like this:

[root@sonnh-uit-lab vgpu_unlock]# ls /lib/modules/3.10.0-1160.71.1.el7.x86_64/extra/
nvidia.ko.xz  nvidia-vgpu-vfio.ko.xz  sysdig-probe.ko.xz

Has anyone encountered this error before? Please guide me on how to fix it.