wvthoog / proxmox-vgpu-installer


Restart Host Every 2 days? #10

Closed awptechnologies closed 3 months ago

awptechnologies commented 3 months ago

I just want to make sure I'm correct. If I install this, do I have to reboot my host every 2 days? And if so, is there any way around this? My server is the core of everything in my home, and restarting every 2 days may cause issues. I swear I read something about 90 days somewhere. Sorry for posting this in issues, I didn't see a discussion section.

wvthoog commented 3 months ago

It's most likely a licensing issue, and licensing is now incorporated in the script. Just choose the licensing option from the menu and run the script it generates in the VM (either Windows or Linux).

Make sure your timezone (time) settings are the same between Proxmox and your VMs.

awptechnologies commented 3 months ago

So I won't have to restart the host?

wvthoog commented 3 months ago

Nope, just use a cron job or scheduled task to renew the license within 90 days.

awptechnologies commented 3 months ago

Running the script now. Where it asked if I wanted to pass through other GPUs, I selected no, since I already have this done. Is that correct?

wvthoog commented 3 months ago

Just use the license option only. It will install a FastAPI-DLS Docker container, which will spit out the config you need to run on your hosts.

I'm working on making this more user friendly, but for now this works just as well.

awptechnologies commented 3 months ago

For a P4000, which mdev type do I use if I want 4 VMs with 2 GB apiece? I tried nvidia-50, but only one VM starts. I don't see one with a framebuffer of 2048 and 4 instances. It acts like I have a card with 24 GB of RAM.
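One way to see which mdev types the host actually offers, and how many instances of each are still free, is to read the kernel's mdev sysfs entries for the GPU. The following is a minimal sketch, assuming the standard `mdev_supported_types` layout under the device's sysfs directory; the function name `list_mdev_types` is made up for illustration, and the PCI address in the usage note is a placeholder to replace with the one `lspci` reports.

```shell
#!/bin/sh
# Print one line per mdev type exposed by a PCI device:
# type id, type name, remaining instance count, and the driver's description.
# Argument: the device's sysfs directory (assumed layout: <dir>/mdev_supported_types/<type>/).
list_mdev_types() {
    dev_dir=$1
    for type_dir in "$dev_dir"/mdev_supported_types/*/; do
        # Skip the unexpanded glob when the device exposes no mdev types.
        [ -d "$type_dir" ] || continue
        type_id=$(basename "$type_dir")
        name=$(cat "$type_dir/name" 2>/dev/null)
        avail=$(cat "$type_dir/available_instances" 2>/dev/null)
        desc=$(cat "$type_dir/description" 2>/dev/null)
        printf '%s\t%s\tavailable=%s\t%s\n' "$type_id" "$name" "$avail" "$desc"
    done
}
```

Usage on the Proxmox host would look like `list_mdev_types /sys/bus/pci/devices/0000:04:00.0` (placeholder address). A type whose `available` count drops to 0 after the first VM starts cannot back a second VM, which may explain a "only one VM starts" symptom.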

wvthoog commented 3 months ago

Consult the NVIDIA logs:

journalctl -u nvidia-vgpud.service -n 100
journalctl -u nvidia-vgpu-mgr.service -n 100
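The output of these services can be long, and most of it is backtrace frames. As a rough sketch (the helper name `vgpu_errors` is hypothetical, and the patterns are only guesses based on how these services prefix their messages), a small filter can keep the failure lines and drop the frames that end in a bracketed address:

```shell
#!/bin/sh
# Read vGPU service log lines on stdin and keep only those reporting a failure.
# Backtrace frames (lines ending in a bracketed hex address) are dropped.
vgpu_errors() {
    grep -E 'error:|failed' | grep -vE '\[0x[0-9a-f]+\][[:space:]]*$'
}
```

It could be used as `journalctl -u nvidia-vgpu-mgr.service -n 100 --no-pager | vgpu_errors`, which leaves the handful of messages that say what went wrong.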

awptechnologies commented 3 months ago

Everything is related to a P40, I'm guessing a Tesla P40, but I don't have that. I have a Quadro P4000. All kinds of errors, lol.

May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: cmd: 0x2080012f failed.
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: notice: vmiop_log: (0x0): Cannot query ECC status. vGPU ECC support will be disabled.
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: NVOS status 0x51
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: Assertion Failed at 0xa46e6e1:143
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: 13 frames returned by backtrace
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libnvidia-vgpu.so(_nv009120vgpu+0x35) [0x7d190a4c5385]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libnvidia-vgpu.so(_nv009170vgpu+0x14e) [0x7d190a46d8de]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libnvidia-vgpu.so(_nv009254vgpu+0xe1) [0x7d190a46e6e1]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libnvidia-vgpu.so(+0x875df) [0x7d190a4875df]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libnvidia-vgpu.so(+0x89dac) [0x7d190a489dac]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x1942e) [0x63ab04a1942e]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x1a529) [0x63ab04a1a529]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x14238) [0x63ab04a14238]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x11a66) [0x63ab04a11a66]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x3eaa) [0x63ab04a03eaa]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7d190ab8424a]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7d190ab84305]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: vgpu(+0x3eed) [0x63ab04a03eed]
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: (0x0): Failed to alloc guest FB memory
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: (0x0): init_device_instance failed for inst 0 with error 2 (vmiop-display: error alloc>
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: (0x0): Initialization: init_device_instance failed error 2
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_log: display_init failed for inst: 0
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_env_log: (0x0): vmiope_process_configuration: plugin registration error
May 17 14:51:03 Earth nvidia-vgpu-mgr[60188]: error: vmiop_env_log: (0x0): vmiope_process_configuration failed with 0x1a
May 17 14:53:44 Earth nvidia-vgpu-mgr[49921]: notice: vmiop_env_log: (0x0): Plugin migration stage change none -> stop_and_copy. QEMU migration state:>
May 17 14:53:47 Earth nvidia-vgpu-mgr[49921]: notice: vmiop_log: Stopping all vGPU migration threads
May 17 15:04:49 Earth nvidia-vgpu-mgr[26573]: Nv0000CtrlVgpuGetStartDataParams { mdev_uuid: {00000000-0000-0000-0000-000000000113}, config_params: "vgpu_type_id=156", qemu_pid: 89402, gpu_pci_id: 0x400, vgpu_id: 1, gpu_pci_bdf: 1024,

wvthoog commented 3 months ago

Yeah, that's not right. Contact me directly through my website so that I can log in to your system and document the changes.

awptechnologies commented 3 months ago

I added you on Discord.