DualCoder / vgpu_unlock

Unlock vGPU functionality for consumer grade GPUs.
MIT License

GP107GL #30

Closed zack4485 closed 3 years ago

zack4485 commented 3 years ago

I don't see any explicit support for the Quadro P400 GPU (based on the GP107GL chip)...should we assume these "lesser" variants are supported or not?

KrutavShah commented 3 years ago

We don’t have any of our own to test with, so we can’t guarantee support unless owners of that specific card are able to get it working and can verify that it does. That includes listing any modifications needed to make it work. What’s the PCI ID of your card? You may also join the discord server for more information on this.

zack4485 commented 3 years ago

I don't know the PCI ID off the top of my head; the card is currently not installed. I will test it when I find some free time...I was just hoping somebody would already know the answer and I could avoid finding out the hard way that it doesn't work!

KrutavShah commented 3 years ago

We have added a few GP107 cards, and yours may be among them. Please test it and see whether it works with the P40-1Q profile. If it does, we will keep it; otherwise, it will be removed.

huzhifeng commented 3 years ago

My test results show that the Quadro P400 is not yet supported and needs an additional patch. Below are the relevant information and logs.

root@pve:~# lspci -d 10de:
01:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
root@pve:~#
root@pve:~# lspci -d 10de: -n
01:00.0 0300: 10de:1cb3 (rev a1)
01:00.1 0403: 10de:0fb9 (rev a1)
root@pve:~#
root@pve:~# vi /var/log/syslog
Apr 18 10:54:59 pve kernel: [    5.345664] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
Apr 18 10:54:59 pve kernel: [    5.345807] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Apr 18 10:54:59 pve kernel: [    5.345822] Remap called.
Apr 18 10:54:59 pve kernel: [    5.345834] Remap called.
Apr 18 10:55:00 pve nvidia-vgpud: Verbose syslog connection opened
Apr 18 10:55:00 pve nvidia-vgpud: Started (818)
Apr 18 10:55:00 pve systemd[1]: Started Login Service.
Apr 18 10:55:00 pve nvidia-vgpud: Global settings:
Apr 18 10:55:00 pve nvidia-vgpud: Size: 16#012Version 1
Apr 18 10:55:00 pve nvidia-vgpud: Homogeneous vGPUs: 1
Apr 18 10:55:00 pve nvidia-vgpud: vGPU types: 494#012
Apr 18 10:55:00 pve nvidia-vgpud:
Apr 18 10:55:00 pve nvidia-vgpud: pciId of gpu [0]: 0:1:0:0
Apr 18 10:55:00 pve kernel: [    6.436981] Remap called.
...
Apr 18 10:55:00 pve kernel: [    6.813694] NVRM: GPU at 0000:01:00.0 has software scheduler DISABLED with policy BEST_EFFORT.
Apr 18 10:55:00 pve kernel: [    6.924557] Magic found: 47 9c b6 e1 45 1a f8 4c 1a f9 48 cc 3b 9a e8 b9
Apr 18 10:55:00 pve kernel: [    6.924640] Key found: ad 63 21 fc 4c 3e 17 f8 bc 68 ad 65 05 c7 79 38
Apr 18 10:55:00 pve kernel: [    6.936432] Magic is at: ffffffffc2ed8a87
Apr 18 10:55:00 pve kernel: [    6.948450] Pointers found, magic: ffffffffc2ee72e0 blocks: ffffffffc2ee72e8 sign: ffffffffc2ee72f0
Apr 18 10:55:00 pve kernel: [    6.948453] Generate signature is: a3 11 a8 c2 62 8c 7e 2a cf c4 1d fb 87 df 32 2a 29 06 68 d1 07 01 f3 62 7d 13 ba 14 8a ba 1d e6
Apr 18 10:55:00 pve kernel: [    6.960353] Sacrificial magic is at: ffffffffc2ed8267
Apr 18 10:55:00 pve kernel: [    6.972342] Pointers found, sac_magic: ffffffffc2ee22e8 sac_blocks: ffffffffc2ee22f0 sac_sign: ffffffffc2ee22f8
Apr 18 10:55:00 pve kernel: [    6.972343] Decrypted first block is: 16 16 b3 1c 00 00 be 11 c3 15 45 49 5a 4f 20 51.
Apr 18 10:55:00 pve kernel: [    6.972345] vGPU unlock patch applied.
Apr 18 10:55:00 pve kernel: [    6.972401] vmbr0: port 1(enp2s0) entered disabled state
Apr 18 10:55:00 pve kernel: [    7.029701] Remap called.
Apr 18 10:55:00 pve nvidia-vgpu-mgr[851]: notice: vmiop_env_log: nvidia-vgpu-mgr daemon started
Apr 18 10:55:00 pve nvidia-vgpud: GPU not supported by vGPU at PCI Id: 0:1:0:0 DevID: 0x10de / 0x1cb3 / 0x10de / 0x0000
Apr 18 10:55:00 pve nvidia-vgpud: error: failed to send vGPU configuration info to RM: 6
Apr 18 10:55:00 pve nvidia-vgpud: PID file unlocked.
Apr 18 10:55:00 pve nvidia-vgpud: PID file closed.
Apr 18 10:55:00 pve nvidia-vgpud: Shutdown (818)
Apr 18 10:55:00 pve systemd[1]: nvidia-vgpud.service: Succeeded.
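
The PCI ID asked for earlier in the thread is the `10de:1cb3` pair in the `lspci -d 10de: -n` output above, and it matches the DevID that nvidia-vgpud rejects. As a minimal sketch, assuming the output keeps the `slot class: vendor:device` layout shown here, the ID could be pulled out like this (the helper names `parse_lspci_n` and `vga_device_ids` are made up for illustration):

```python
import re

# Matches lines such as "01:00.0 0300: 10de:1cb3 (rev a1)" from `lspci -n`.
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)


def parse_lspci_n(text):
    """Return (slot, class, vendor, device) tuples, one per matching line."""
    rows = []
    for line in text.splitlines():
        m = LSPCI_N_LINE.match(line.strip())
        if m:
            rows.append((m["slot"], m["cls"], m["vendor"], m["device"]))
    return rows


def vga_device_ids(text):
    """Device IDs of functions whose class code is 0300 (VGA compatible controller)."""
    return [device for _slot, cls, _vendor, device in parse_lspci_n(text) if cls == "0300"]


# The two lines reported in this thread.
sample = """01:00.0 0300: 10de:1cb3 (rev a1)
01:00.1 0403: 10de:0fb9 (rev a1)"""
print(vga_device_ids(sample))
```

Only the class `0300` function is the GPU itself; the `0403` function is the card's audio controller.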

huzhifeng commented 3 years ago

Here is the patch for GP107GL [Quadro P400]: 0b51420. I tested and verified it with Proxmox 6.3-6.

syslog:

root@pve:~# vi /var/log/syslog
Apr 18 12:01:56 pve kernel: [    6.951657] Remap called.
Apr 18 12:01:56 pve kernel: [    6.951660] BAR3 mapped at: 0xFFFFBE9746000000
Apr 18 12:01:56 pve kernel: [    6.961455] vmbr0: port 1(enp2s0) entered disabled state
Apr 18 12:01:56 pve kernel: [    6.969243] NVRM: GPU at 0000:01:00.0 has software scheduler DISABLED with policy BEST_EFFORT.
Apr 18 12:01:56 pve kernel: [    7.072368] Magic found: 47 9c b6 e1 45 1a f8 4c 1a f9 48 cc 3b 9a e8 b9
Apr 18 12:01:56 pve kernel: [    7.072396] Key found: ad 63 21 fc 4c 3e 17 f8 bc 68 ad 65 05 c7 79 38
Apr 18 12:01:56 pve kernel: [    7.084184] Magic is at: ffffffffc33c7a87
Apr 18 12:01:56 pve kernel: [    7.096206] Pointers found, magic: ffffffffc33d62e0 blocks: ffffffffc33d62e8 sign: ffffffffc33d62f0
Apr 18 12:01:56 pve kernel: [    7.096209] Generate signature is: a3 11 a8 c2 62 8c 7e 2a cf c4 1d fb 87 df 32 2a 29 06 68 d1 07 01 f3 62 7d 13 ba 14 8a ba 1d e6
Apr 18 12:01:56 pve kernel: [    7.108121] Sacrificial magic is at: ffffffffc33c7267
Apr 18 12:01:56 pve kernel: [    7.120117] Pointers found, sac_magic: ffffffffc33d12e8 sac_blocks: ffffffffc33d12f0 sac_sign: ffffffffc33d12f8
Apr 18 12:01:56 pve kernel: [    7.120119] Decrypted first block is: 16 16 b3 1c 00 00 be 11 c3 15 45 49 5a 4f 20 51.
Apr 18 12:01:56 pve kernel: [    7.120156] vGPU unlock patch applied.
Apr 18 12:01:56 pve kernel: [    7.181469] Remap called.
Apr 18 12:01:56 pve iscsid: iSCSI daemon with pid=715 started!
Apr 18 12:01:57 pve nvidia-vgpud: pciId of gpu [0]: 0:1:0:0
Apr 18 12:01:57 pve nvidia-vgpu-mgr[854]: notice: vmiop_env_log: nvidia-vgpu-mgr daemon started
Apr 18 12:01:57 pve nvidia-vgpud:
Apr 18 12:01:57 pve nvidia-vgpud: Physical GPU:
Apr 18 12:01:57 pve nvidia-vgpud: PciID: 0x0000 / 0x0001 / 0x0000 / 0x0000
Apr 18 12:01:57 pve nvidia-vgpud: Size: 52#012Version 1
Apr 18 12:01:57 pve nvidia-vgpud: DevID: 0x10de / 0x1bb3 / 0x10de / 0x0000
Apr 18 12:01:57 pve nvidia-vgpud: Supported vGPUs count: 14
Apr 18 12:01:57 pve nvidia-vgpud:
Apr 18 12:01:57 pve nvidia-vgpud: Supported VGPU 0x47: max 8
Apr 18 12:01:57 pve nvidia-vgpud: VGPU Type 0x47: GRID P4-1B Class: NVS
Apr 18 12:01:57 pve nvidia-vgpud: DevId: 0x10de / 0x1bb3 / 0x10de / 0x1203#012
Apr 18 12:01:57 pve nvidia-vgpud: Framebuffer: 0x38000000
Apr 18 12:01:57 pve nvidia-vgpud: Mappable video size: 0x400000
Apr 18 12:01:57 pve nvidia-vgpud: Framebuffer reservation: 0x8000000
Apr 18 12:01:57 pve nvidia-vgpud: FRL configuration: 0x2d
Apr 18 12:01:57 pve nvidia-vgpud: CUDA enabled: 0x0
Apr 18 12:01:57 pve nvidia-vgpud: ECC supported: 0x0
Apr 18 12:01:57 pve nvidia-vgpud: Multi vGPU supported: 0x0
Apr 18 12:01:57 pve nvidia-vgpud: Encoder Capacity: 0x64
Apr 18 12:01:57 pve nvidia-vgpud: BAR1 Length: 0x100
Apr 18 12:01:57 pve nvidia-vgpud: Frame Rate Limiter enabled: 0x1
Apr 18 12:01:57 pve nvidia-vgpud: Number of Displays: 4#012
Apr 18 12:01:57 pve nvidia-vgpud: Max pixels: 16384000#012
Apr 18 12:01:57 pve nvidia-vgpud: Display: width 5120, height 2880
Apr 18 12:01:57 pve nvidia-vgpud: License: GRID-Virtual-PC,2.0;Quadro-Virtual-DWS,5.0;GRID-Virtual-WS,2.0;GRID-Virtual-WS-Ext,2.0
...
Apr 18 12:01:57 pve nvidia-vgpud: Supported VGPU 0x120: max 2
Apr 18 12:01:57 pve nvidia-vgpud: VGPU Type 0x120: GRID P4-4C Class: Compute
Apr 18 12:01:57 pve nvidia-vgpud: DevId: 0x10de / 0x1bb3 / 0x10de / 0x1385#012
Apr 18 12:01:57 pve nvidia-vgpud: Framebuffer: 0xec000000
Apr 18 12:01:57 pve nvidia-vgpud: Mappable video size: 0x400000
Apr 18 12:01:57 pve nvidia-vgpud: Framebuffer reservation: 0x14000000
Apr 18 12:01:57 pve nvidia-vgpud: FRL configuration: 0x3c
Apr 18 12:01:57 pve nvidia-vgpud: CUDA enabled: 0x1
Apr 18 12:01:57 pve nvidia-vgpud: ECC supported: 0x1
Apr 18 12:01:57 pve nvidia-vgpud: Multi vGPU supported: 0x0
Apr 18 12:01:57 pve nvidia-vgpud: Encoder Capacity: 0x64
Apr 18 12:01:57 pve nvidia-vgpud: BAR1 Length: 0x100
Apr 18 12:01:57 pve nvidia-vgpud: Frame Rate Limiter enabled: 0x1
Apr 18 12:01:57 pve nvidia-vgpud: Number of Displays: 1#012
Apr 18 12:01:57 pve nvidia-vgpud: Max pixels: 8847360#012
Apr 18 12:01:57 pve nvidia-vgpud: Display: width 4096, height 2160
Apr 18 12:01:57 pve nvidia-vgpud: License: NVIDIA-vComputeServer,9.0;Quadro-Virtual-DWS,5.0
Apr 18 12:01:57 pve kernel: [    7.406205] nvidia 0000:01:00.0: MDEV: Registered
Apr 18 12:01:57 pve nvidia-vgpud: PID file unlocked.
Apr 18 12:01:57 pve nvidia-vgpud: PID file closed.
Apr 18 12:01:57 pve nvidia-vgpud: Shutdown (855)
Apr 18 12:01:57 pve systemd[1]: nvidia-vgpud.service: Succeeded.
root@pve:~#
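
Comparing the two syslog excerpts, the visible difference is the DevID nvidia-vgpud reports: `0x1cb3` before the patch (rejected) and `0x1bb3` after it (listed with the GRID P4 profiles). The sketch below only illustrates that kind of device ID substitution. It is not the code from the commit, and the table and function names are hypothetical:

```python
# Hypothetical lookup table: real device ID -> device ID reported to the vGPU stack.
# Only the 0x1cb3 -> 0x1bb3 pair is taken from the logs in this thread.
SPOOFED_DEVICE_IDS = {
    0x1CB3: 0x1BB3,  # Quadro P400 (GP107GL) -> ID listed with the GRID P4 profiles
}


def spoof_device_id(actual):
    """Return the substitute ID if one is listed, otherwise the ID unchanged."""
    return SPOOFED_DEVICE_IDS.get(actual, actual)


print(hex(spoof_device_id(0x1CB3)))
```

An ID that is not in the table passes through unchanged, so cards that are already supported would not be affected by a lookup of this shape.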

nvidia-smi:

root@pve:~# /root/vgpu_unlock/vgpu_unlock nvidia-smi vgpu -s
root@pve:~# GPU 00000000:01:00.0
    GRID P4-1B
    GRID P4-1Q
    GRID P4-2Q
    GRID P4-4Q
    GRID P4-8Q
    GRID P4-1A
    GRID P4-2A
    GRID P4-4A
    GRID P4-8A
    GRID P4-2B
    GRID P4-2B4
    GRID P4-1B4
    GRID P4-8C
    GRID P4-4C
root@pve:~#
root@pve:~# /root/vgpu_unlock/vgpu_unlock nvidia-smi
root@pve:~# Sun Apr 18 12:26:51 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.04    Driver Version: 460.32.04    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:01:00.0 Off |                  N/A |
| 47%   60C    P0    N/A /  N/A |   1011MiB /  2047MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2599    C+G   vgpu                             1000MiB |
+-----------------------------------------------------------------------------+
root@pve:~#
root@pve:~# /root/vgpu_unlock/vgpu_unlock nvidia-smi vgpu
root@pve:~# Sun Apr 18 12:27:17 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.04              Driver Version: 460.32.04                 |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  Quadro P400                | 00000000:01:00.0             | 100%       |
|      3251634184  GRID P4-1Q     | 0000...  win10-vgpu          |     99%    |
+---------------------------------+------------------------------+------------+
root@pve:~#

mdev:

root@pve:~# ls "/sys/bus/pci/devices/0000:01:00.0/mdev_supported_types"
nvidia-157  nvidia-243  nvidia-289  nvidia-64  nvidia-66  nvidia-68  nvidia-70
nvidia-214  nvidia-288  nvidia-63   nvidia-65  nvidia-67  nvidia-69  nvidia-71
root@pve:~#
root@pve:~# ls -l /sys/bus/mdev/devices/00000000-0000-0000-0000-000000000100
lrwxrwxrwx 1 root root 0 Apr 18 12:28 /sys/bus/mdev/devices/00000000-0000-0000-0000-000000000100 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/00000000-0000-0000-0000-000000000100
root@pve:~#
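
The `nvidia-NN` directory names do not say which profile they stand for. In this thread the VM configured with `nvidia-63` shows up in `nvidia-smi vgpu` as GRID P4-1Q. As a rough sketch, assuming each type directory carries the `name` and `available_instances` attributes of the kernel's mediated device sysfs interface, the mapping could be listed like this (`list_mdev_types` and its `sysfs_root` parameter are made up for illustration):

```python
from pathlib import Path


def list_mdev_types(pci_address, sysfs_root="/sys"):
    """Map each mdev type ID to (name, available_instances) for one PCI device."""
    base = (Path(sysfs_root) / "bus" / "pci" / "devices"
            / pci_address / "mdev_supported_types")
    result = {}
    if not base.is_dir():
        return result  # device absent or no mdev support registered
    for type_dir in sorted(base.iterdir()):
        name_file = type_dir / "name"
        avail_file = type_dir / "available_instances"
        # Either attribute may be missing; fall back to an empty string.
        name = name_file.read_text().strip() if name_file.is_file() else ""
        avail = avail_file.read_text().strip() if avail_file.is_file() else ""
        result[type_dir.name] = (name, avail)
    return result


if __name__ == "__main__":
    for type_id, (name, avail) in list_mdev_types("0000:01:00.0").items():
        print(type_id, name, avail)
```

On a host without the device the function returns an empty mapping instead of raising.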

qemu:

root@pve:~# cat /etc/pve/qemu-server/100.conf
agent: 1
boot: order=scsi0;ide2;net0
cores: 2
hostpci0: 01:00.0,mdev=nvidia-63
ide1: local:iso/virtio-win-0.1.190.iso,media=cdrom,size=489986K
ide2: local:iso/cn_windows_10_business_editions_version_20h2_x64_dvd_f978664f.iso,media=cdrom
machine: pc-i440fx-5.2
memory: 2048
name: win10-vgpu
net0: virtio=76:CE:99:18:E1:10,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: local-lvm:vm-100-disk-0,cache=writeback,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=1c528cb7-9d74-4ee8-84ab-26764ea9bda1
sockets: 1
vmgenid: 9870c104-49f7-4c07-9931-de7b5dc1838c
args: -uuid 00000000-0000-0000-0000-000000000100
root@pve:~#
root@pve:~# ps -ef | grep qemu
root      2533     1 99 12:11 ?        00:23:33 /usr/bin/kvm -id 100 -name win10-vgpu -no-shutdown -chardev socket,id=qmp,path=/var/run/qemu-server/100.qmp,server,nowait -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/100.pid -daemonize -smbios type=1,uuid=1c528cb7-9d74-4ee8-84ab-26764ea9bda1 -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vnc unix:/var/run/qemu-server/100.vnc,password -no-hpet -cpu kvm64,enforce,hv_ipi,hv_relaxed,hv_reset,hv_runtime,hv_spinlocks=0x1fff,hv_stimer,hv_synic,hv_time,hv_vapic,hv_vpindex,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep -m 2048 -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device vmgenid,guid=9870c104-49f7-4c07-9931-de7b5dc1838c -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:01:00.0/00000000-0000-0000-0000-000000000100,id=hostpci0,bus=pci.0,addr=0x10 -device VGA,id=vga,bus=pci.0,addr=0x2,edid=off -chardev socket,path=/var/run/qemu-server/100.qga,server,nowait,id=qga0 -device virtio-serial,id=qga0,bus=pci.0,addr=0x8 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -iscsi initiator-name=iqn.1993-08.org.debian:01:7e40bdcc46a7 -drive file=/var/lib/vz/template/iso/virtio-win-0.1.190.iso,if=none,id=drive-ide1,media=cdrom,aio=threads -device ide-cd,bus=ide.0,unit=1,drive=drive-ide1,id=ide1 -drive file=/var/lib/vz/template/iso/cn_windows_10_business_editions_version_20h2_x64_dvd_f978664f.iso,if=none,id=drive-ide2,media=cdrom,aio=threads -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=101 -device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 -drive 
file=/dev/pve/vm-100-disk-0,if=none,id=drive-scsi0,cache=writeback,format=raw,aio=threads,detect-zeroes=on -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100 -netdev type=tap,id=net0,ifname=tap100i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on -device virtio-net-pci,mac=76:CE:99:18:E1:10,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=102 -rtc driftfix=slew,base=localtime -machine type=pc-i440fx-5.2+pve0 -global kvm-pit.lost_tick_policy=discard -uuid 00000000-0000-0000-0000-000000000100
root      5520  1717  0 12:29 pts/0    00:00:00 grep qemu
root@pve:~#
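
The `-uuid` value in `args:` is the same as the mdev device UUID under `/sys/bus/mdev/devices`, and for VM 100 with `hostpci0` it reads as the hostpci index followed by the VM ID, both zero-padded. The sketch below reproduces that pattern. It is an assumption drawn from this single output, not a documented guarantee, and `mdev_uuid` is a made-up helper name:

```python
def mdev_uuid(vmid, hostpci_index=0):
    """Build a UUID of the form seen in this thread from the hostpci index and VM ID."""
    # Assumed layout: 8-digit index, three zero groups, 12-digit VM ID.
    return f"{hostpci_index:08d}-0000-0000-0000-{vmid:012d}"


print(mdev_uuid(100))
```

For VM 100 this prints the same value that appears in the config and in the `kvm` command line above.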

DualCoder commented 3 years ago

I'm closing this since the relevant changes have been merged.