dockur / windows

Windows inside a Docker container.
MIT License

GPU Passthrough #22

Open Joly0 opened 7 months ago

Joly0 commented 7 months ago

Hey, I would like to know if this container is capable of passing through a GPU to the VM inside the container. I have looked into the upstream container qemus/qemu-docker, which seems to have some logic for GPU passthrough, so some documentation on this here would be great, if it is possible.

I also tried to connect to the container using Virtual Machine Manager, but unfortunately I wasn't able to. Any idea why?

kroese commented 7 months ago

To pass through your Intel iGPU, add the following lines to your compose file:

environment:
  GPU: "Y"
devices:
  - /dev/dri
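
For context, a minimal sketch of where those lines fit in a full compose file; the image name, /dev/kvm device and web-viewer port follow this project's README, while the VERSION value below is purely illustrative:

    services:
      windows:
        image: dockurr/windows
        environment:
          VERSION: "11"   # illustrative: whichever Windows version you normally use
          GPU: "Y"        # enables the iGPU logic
        devices:
          - /dev/kvm      # KVM acceleration
          - /dev/dri      # Intel iGPU render nodes
        ports:
          - 8006:8006     # web viewer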

However, this feature is mainly so that you can transcode video files in Linux using hardware acceleration. I do not know if it also works as a Display Adaptor (for accelerating the desktop) and I never tried it in Windows, so it might not work at all. But if you want to test it, go ahead!

I don't know what you mean by Virtual Machine Manager. If you mean the package that Synology provides on its NAS devices, then that seems normal, as it only connects to its own VMs, not to any random QEMU VM. See also my other project, https://github.com/vdsm/virtual-dsm, for running Synology DSM in Docker. If you mean something else, please provide some more details about what you were trying to do.

Joly0 commented 7 months ago

I meant this project: https://virt-manager.org/ - though it requires SSH on the client (in this case the windows-in-docker container) to connect to it.

Also, I don't have an Intel GPU, just an AMD iGPU and an Nvidia dGPU. I thought maybe it would be possible to pass through the Nvidia one so it could be used as an output device.

kroese commented 7 months ago

You can SSH to the container, if you do something like:

    environment:
      HOST_PORTS: "22"
    ports:
      - 22:22

But I have no experience with Virt-Manager, so I cannot say if that works. I assume not, because it seems related to libvirt and virsh, and my container uses QEMU directly, without any help from virsh.

As for the Nvidia GPU, I am sure it is possible. But it's kind of complicated, because it needs to pass through both Docker and QEMU. Unfortunately I don't have any Nvidia or AMD GPU myself, so someone else has to submit the code, because I have no way to test it.

Joly0 commented 7 months ago

Hm, I already tried forwarding port 22, but I couldn't connect to the container with SSH. It seems to me like openssh-server is missing, but even then it somehow doesn't work.

If you could give me some information on how to get started with passing through an Nvidia GPU, I could try it myself and provide the code afterwards.

kroese commented 7 months ago

The container forwards all the traffic to the VM, and Windows does not respond on port 22.

So this HOST_PORTS: "22" is really important to prevent the container from forwarding that port to Windows. Just ports: - 22:22 is not enough in this case. You can get a bash shell into the container via Docker, and run something like apt-get install openssh-server if needed. But I am not sure if it's worth the effort, as most likely virt-manager will not be able to find virsh even if port 22 is open.
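
For reference, a minimal sketch of that shell route, assuming the container is named windows (the package is the stock Debian openssh-server):

    docker exec -it windows bash
    apt-get update && apt-get install -y openssh-server
    service ssh start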

As for getting the passthrough to work: you can add additional QEMU parameters via the ARGUMENTS= variable. So I would Google for terms like QEMU+NVidia+passthrough and see if you can find the correct parameters. Then put them in ARGUMENTS and see what effect they have until you discover the correct ones.
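
Purely as an illustration of how such parameters would be injected (not a working recipe): the vfio-pci address below is a placeholder for your own GPU, and it assumes the host has already bound that device to the vfio-pci driver:

    environment:
      ARGUMENTS: "-device vfio-pci,host=01:00.0,multifunction=on"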

kingkunta88 commented 7 months ago

Any advice on passing through an nvidia gpu?

domrockt commented 7 months ago

> Any advice on passing through an nvidia gpu?

I can test tomorrow, but it should be for Unraid.

In Extra Parameters: --runtime=nvidia

New variable - Name: Nvidia GPU UUID, Key: NVIDIA_VISIBLE_DEVICES, Value: all <--- or, if you have more than one Nvidia GPU, the specific GPU UUID
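
For reference, the plain docker run equivalent of those Unraid template entries would be roughly the following (a sketch; the container's other required options are omitted):

    docker run --runtime=nvidia \
      -e NVIDIA_VISIBLE_DEVICES=all \
      --device=/dev/kvm \
      dockurr/windows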

domrockt commented 7 months ago

> To pass through your Intel iGPU, add the following lines to your compose file: […]

not working with Unraid.

extra Param: --device='/dev/dri' <--- does not work

AND

new device: /dev/dri/

Joly0 commented 7 months ago

> Any advice on passing through an nvidia gpu?

> I can test tomorrow, but it should be for Unraid. In Extra Parameters: --runtime=nvidia […]

I think this only passes the Nvidia GPU capabilities to the container, not to the VM, but I might be wrong.

Allram commented 7 months ago

> I can test tomorrow, but it should be for Unraid. In Extra Parameters: --runtime=nvidia […]
>
> I think this only passes the Nvidia GPU capabilities to the container, not to the VM, but I might be wrong.

That's correct. I tried this now and my VM does not see the GPU. It works fine for containers like Plex etc., so it must be something in the "link" between the Docker container and the VM.

ladrive commented 7 months ago

> To pass through your Intel iGPU, add the following lines to your compose file: […]

> not working with Unraid. extra Param: --device='/dev/dri' <--- does not work […]

Almost there (or not).

Using Unraid with an Intel iGPU (13th-gen Intel UHD 770), I modified the template to include two new entries (a device and a variable).

[screenshot]

Apparently everything is detected and the drivers are installed.

[screenshot]

Inside the VM, nothing is detected and there is no sign of the GPU.

[screenshot]

Can you help with the next step (if possible)?

kroese commented 7 months ago

It totally depends on what you are trying to achieve. In the screenshot I see the GPU adapter in Windows device manager, so that means it works and everything is okay.

It's a virtual graphics card that can be used for hardware acceleration, for example when encoding video or running certain calculations. All these tasks will be performed by your Intel GPU through this virtual device.

But if your goal is to use the HDMI display output to connect a monitor, I do not think this graphics card is fit for that purpose. So it all depends on what you are trying to do.

domrockt commented 7 months ago

> It totally depends on what you are trying to achieve. In the screenshot I see the GPU adapter in Windows device manager, so that means it works and everything is okay. […]

I guess he is right. I am in Steam Link right now, downloading a small game to test it out (Intel iGPU, 13700K). I will test it with Pacify; it should run fine.

I stream from my Unraid server to my iPhone 15 Pro Max over Wi-Fi 5.

So no, the game needs a DirectX device, which is not installed :D

kroese commented 7 months ago

@domrockt It's possible that this "GPU DOD" device has no DirectX. QEMU supports many different video devices, and this one is meant for transcoding video, so we obviously need to tell QEMU to create a different device that is more suitable for gaming.

I will see if I can fix it, but it's a bit low on my priority list, so if somebody else has the time to figure out how to do it in QEMU, it would be appreciated.

domrockt commented 7 months ago

> It's possible that this "GPU DOD" device has no DirectX. QEMU supports many different video devices, and this one is meant for transcoding video, so we obviously need to tell QEMU to create a different device that is more suitable for gaming. […]

I guess this one: https://github.com/virtio-win/kvm-guest-drivers-windows but this is my wits' end for now :D

ladrive commented 7 months ago

> It totally depends on what you are trying to achieve. In the screenshot I see the GPU adapter in Windows device manager, so that means it works and everything is okay. […]

It's not possible to use any type of acceleration (e.g. YouTube, decoding/encoding files, ...) inside the VM, and zero activity is detected on the host side.

[screenshot]

Apparently the VM works the same way... nothing is different with or without iGPU passthrough.

kroese commented 7 months ago

I know for certain that it works in Linux guests, as I use the same code in my other project ( https://github.com/vdsm/virtual-dsm ) where the GPU is used for accelerating facial recognition in photos, etc.

I never tried it in a Windows guest, so it's possible that it does not work there (or needs special drivers to be installed). I created this container only one day ago, and even much less advanced features (like an audio device for sound) are not implemented yet. So it's better to focus on getting the basics finished first; the very complicated/advanced stuff like GPU acceleration will be one of the last things on the list, sorry.

ladrive commented 7 months ago

> I know for certain that it works in Linux guests, as I use the same code in my other project ( https://github.com/vdsm/virtual-dsm ), where the GPU is used for accelerating facial recognition in photos, etc. […]

I also use passthrough in several other containers (Plex, Jellyfin, Frigate, ...). Being able to achieve it in this container would be a great thing (for applications designed to work only on Windows). Sharing the iGPU between containers, rather than dedicating it to a single VM, can be a very economical, versatile and power-efficient approach.

Looking forward to hearing from you in the future on this matter.

Despite this "issue", thanks for your hard work. 👍

Joly0 commented 7 months ago

> I know for certain that it works in Linux guests, as I use the same code in my other project ( https://github.com/vdsm/virtual-dsm ), where the GPU is used for accelerating facial recognition in photos, etc. […]

That is definitely reasonable. I appreciate your hard work.

kroese commented 7 months ago

I did some investigation, and it seems it's possible to have DirectX in a Windows guest by using the virtio-gpu-gl display device and the experimental drivers from this topic: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/943 .

The other option is PCI passthrough, but it is less nice in the sense that it requires exclusive access to the device, so you cannot use the same device for multiple containers. It is also very complicated to support, because depending on the generation of the iGPU you will need different methods, for example SR-IOV for very recent Intel Xe graphics, GVT-g for others, etc. It will be impossible to add a short and universal set of instructions to the FAQ that will work for most graphics cards.
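
For anyone who wants to experiment before this lands in the container, the QEMU side of that first option looks roughly like the flags below (untested here; virtio-vga-gl is the VGA-flavoured variant of the virtio-gpu-gl device, and renderD128 is just the usual first render node on the host):

    -device virtio-vga-gl -display egl-headless,rendernode=/dev/dri/renderD128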

CallyHam commented 6 months ago

If you use Windows 7 as the guest and RDP into the container from a Windows 7 machine that has Aero enabled, you get Aero Glass effects in the VM. Not sure if it accelerates anything other than the desktop experience, though.

[screenshot]

kieeps commented 5 months ago

I have to say that I was intrigued by the idea of running this container as a Windows game-streaming server and passing my Nvidia GPU through to the VM... but looking through this and the qemus/qemu-docker project, I understand that it would be a huge project :)

I'll probably find some other use case for this though :D

tarunx commented 5 months ago

So using this project as a game-streaming server is not possible? Is there any other game-streaming alternative that is hosted in Docker?

kieeps commented 5 months ago

Not that I know of. My plan was to have a Windows VM on my server that could run all the games my Linux PC can't, but Nvidia and Docker are not a fun project :-/ I don't know how to pass an Nvidia card through to a container without an extra nvidia-toolkit container as a layer in between.

At least that's my guess :-) I could look into it a bit more; if the container can access the GPU, QEMU should be able to use it with some modification, I guess.

Husky110 commented 5 months ago

@kieeps - Maybe I can be of help... :) I got the NVIDIA Container Toolkit running on my server, so I can use some AI stuff. Maybe check out the docs for it (see https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) so you don't need an extra container in between. If you could check that out, that would be great, because I am considering doing the same, but I am busy tonight with another project... :)

Husky110 commented 5 months ago

@tarunx - Maybe try out using KasmVNC - they have a SteamContainer so it might be possible...

sweetbbak commented 5 months ago

Passing a GPU through to QEMU is quite the process, and doing it in a container just adds an extra layer of issues. Typically you have to enable vfio-pci and the IOMMU for your CPU type in the kernel modules. Then you use QEMU options to pass the device through. You can remotely connect to a running QEMU instance (virt-manager is typically what people use).

Then add in Docker/Podman and it's a whole other thing. I bet someone has done it, but it doesn't necessarily sound easy. What I did was install Nix on a remote machine and follow this guide: https://alexbakker.me/post/nixos-pci-passthrough-qemu-vfio.html and there are a lot of articles about the options QEMU needs. I'm curious to see if someone tries this on top of Docker/Podman.
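
For anyone starting from scratch, the usual host preparation boils down to roughly this (a sketch; Intel shown, AMD boards use amd_iommu=on instead):

    # kernel command line (e.g. in /etc/default/grub): enable the IOMMU
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

    # load the VFIO modules so the GPU can later be detached from its host driver
    modprobe vfio
    modprobe vfio_iommu_type1
    modprobe vfio_pci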

m4tt72 commented 5 months ago

I already have GPU passthrough enabled; is there an argument I can pass to allow the Windows guest to use the passed-through GPU?

[screenshot]

alpha754293 commented 5 months ago

So I tried to follow the instructions here for the nvidia container toolkit and here for the docker-compose for GPU support -- no cigar yet.

I'm going to keep trying.

I am also trying to use Portainer to see if it can take some of the heavy lifting off of me.

I have already successfully passed through my 3090 from my Proxmox host to an Ubuntu 20.04 LXC container, and I was able to install the Linux NVIDIA drivers with the --no-kernel-module option (for said unprivileged LXC container), as verified by the output of nvidia-smi.

Now I am trying to pass it into Docker/Portainer, and then into the VM.

Will report back if I found a recipe/process that works.

Edit: The three issues that I am running into now are:

1) When I try to add it into the docker-compose.yml file in Portainer, it will deploy the VM, but the GPU won't show up.

2) If I try to bring up the VM via "normal" docker-compose, i.e. NOT through Portainer, I get this error message:

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.win11.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)

3) If I take the lines that call for the GPU device back out, try to edit the container from within Portainer, and then try to add them back in via the Portainer GUI, I get this error message (from Portainer) instead:

invalid CapDrop: capability not supported by your kernel or not available in the current environment: "CAP_MAC_ADMIN"

And the interesting thing about the last error message from Portainer is that the capability "MAC_ADMIN" is neither added nor enabled via the Portainer GUI, so I am not really sure why it thinks it is being requested, which then causes it to fail.

So no luck with Nvidia passthrough.

I've also tried changing the docker-compose.yml version number to something higher than version 3 (e.g. version 3.7) and it still gave me the same error message.

Edit 2: I uninstalled the docker-compose package from the Ubuntu 20.04 repository and installed it with this command instead, from the Docker documentation page:

curl -SL https://github.com/docker/compose/releases/download/v2.24.7/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose

It's re-downloading the Windows 11 installer ISO image now.

Edit 3: No luck.

It recognises my docker-compose.yml file, but will not attach my 3090 to the VM.

If I run the example from here, it will run nvidia-smi. So that tells me that the GPU passthrough is working to at least a Ubuntu container, just not the Windows VM.

The 3090 never shows up in Device Manager.

Husky110 commented 5 months ago

@alpha754293 - I don't know if you have tried this, but I found that passing through a single GPU sometimes has its drawbacks, in that Docker can't really work with it for whatever reason. I'm using the following snippet (and the analogous docker run flags) for my AI-related containers; those seem to get the passthrough and it works pretty well:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]

And for docker run I use --gpus all (see https://stackoverflow.com/questions/70761192/docker-compose-equivalent-of-docker-run-gpu-all-option ). So maybe passing "all" GPUs might help.
Plus, as far as I understand how the image works, it utilizes QEMU inside the container to actually start Windows in a virtual environment. So maybe the way to go is actually sort of two-staged.
Stage 1: Pass the GPU to the container and get it to work there with drivers. <-- this one should be easy (except for the drivers...)
Stage 2: Pass the GPU inside the Stage 1 container to QEMU, sort of treating the container as "bare metal" and trying to get it to run from there.
I think that might be a way to go.
Edit: Since you work with Portainer already, maybe check out Kasm Workspaces. Basically they are containers with a VNC connection to a desktop environment, and they already have passthrough container images. Maybe using one of them as a base could be an idea for the Stage 1 container. :)

alpha754293 commented 5 months ago

@Husky110 I did try that.

My CPU is an AMD Ryzen 5950X, so it has no iGPU, and therefore my 3090 is the only GPU in the system.

Passing the GPU through from my Proxmox host to the LXC container and then to the Ubuntu CUDA container to run nvidia-smi works, so I know that it CAN be passed through.

But that's container-to-container passthrough.

Normally, to pass through to a KVM VM, at least on my Proxmox host, I would have a hookscript which basically tells the VM "hey, I am about to pass a GPU through to you" and then goes ahead and does so.

This is my gpu-hookscript.pl that I have for my Proxmox host:

#!/usr/bin/perl

# Example hook script for PVE guests (hookscript config option)
# You can set this via pct/qm with
#   pct set <vmid> -hookscript <volume-id>
#   qm set <vmid> -hookscript <volume-id>
# where <volume-id> has to be an executable file in the snippets folder
# of any storage with directories, e.g.:
#   qm set 100 -hookscript local:snippets/hookscript.pl

use strict;
use warnings;

print "GUEST HOOK: " . join(' ', @ARGV) . "\n";

# First argument is the vmid
my $vmid = shift;

# Second argument is the phase
my $phase = shift;

if ($phase eq 'pre-start') {
    # First phase 'pre-start' will be executed before the guest
    # is started. Exiting with a code != 0 will abort the start.
    print "$vmid is starting, doing preparations.\n";

    # Remove and rescan the GPU's PCI device so the guest gets a clean hand-off
    system('echo 1 > /sys/bus/pci/devices/0000\:81\:00.0/remove');
    system('echo 1 > /sys/bus/pci/rescan');

    # print "preparations failed, aborting."
    # exit(1);
} elsif ($phase eq 'post-start') {
    # Second phase 'post-start' will be executed after the guest
    # successfully started.
    print "$vmid started successfully.\n";
} elsif ($phase eq 'pre-stop') {
    # Third phase 'pre-stop' will be executed before stopping the guest
    # via the API. Will not be executed if the guest is stopped from
    # within, e.g. with a 'poweroff'.
    print "$vmid will be stopped.\n";
} elsif ($phase eq 'post-stop') {
    # Last phase 'post-stop' will be executed after the guest stopped.
    # This should even be executed in case the guest crashes or stopped
    # unexpectedly.
    print "$vmid stopped. Doing cleanup.\n";
} else {
    die "got unknown phase '$phase'\n";
}

exit(0);

So I can understand GPU passthrough working from a host like Proxmox to a container, and then on to another container.

What is less clear is, for a Windows deployment via this Docker --device=/dev/kvm method, how I would be able to do the same thing as (or something very similar to) the GPU hookscript.

What is also interesting is that if you add the environment variable GPU: "Y", it will download the Intel GPU driver.

So I am not sure if there is an Nvidia equivalent to that, as what counts as a valid option isn't always very well defined or documented.

In any case, this is a neat concept for deploying Windows KVM VMs. If you have a decently fast CPU and a decently fast storage subsystem, this is a different way to spin up or deploy Windows VMs.

And I didn't test this, but if the Intel iGPU passthrough worked, then in theory you could potentially use Intel Quick Sync for some hardware-accelerated workloads.

Bummer that it didn't work with an Nvidia RTX 3090 though.

(But if it did, because it is a KVM VM, the GPU would be confined to that VM rather than shared amongst multiple Docker containers.)

vivian-ng commented 5 months ago

I am also very interested in this idea, though I have not had the time to try out this yet.

Just a thought. When passing through a GPU to a QEMU VM on Linux, we usually need to blacklist the driver for the GPU or use the hookscript as mentioned to unload drivers. For a LXC container, is there any way to tell the container not to load the GPU drivers (whether it is nouveau or amdgpu or the Intel one) so that the GPU can be further passed through to the Windows VM running inside that container?

kroese commented 5 months ago

The Intel iGPU passthrough works, but it is not PCI pass-through where you can use the standard drivers, but a "shared" pass-through where you need a VirtIO driver in the guest OS. For Linux this driver is available and stable, but for Windows it is not mature yet (see https://github.com/virtio-win/kvm-guest-drivers-windows/pull/943 ).

So yes, even though Intel iGPU pass-through works, it is kind of useless as there is no DirectX driver available yet.

alpha754293 commented 5 months ago

@vivian-ng I'm not sure about that.

My experience so far has been limited to passing the GPU through to an LXC container and making sure that's working.

I'm not sure if there's a way to pass the GPU through but then have the LXC container NOT use said driver.

My thought is that since the LXC container uses the host's kernel, and the driver is already blacklisted in the host kernel while the GPU is passed through, the container doesn't have its own /etc/modprobe.d blacklist.

But other people who have more experience than me might be better suited to answer this question.

@kroese Re: the Intel iGPU - you might not be able to play games with it without DirectX support, but you might still be able to leverage Intel Quick Sync for hardware-accelerated workloads (ones that can use it).

kroese commented 5 months ago

How it works is that you pass a device called /dev/dri to the container, and then the script passes it onwards to QEMU. So it should also work for Nvidia, because I'm sure it provides the same /dev/dri device. It's just that I didn't have an Nvidia device to test with, so currently it only installs the Intel drivers when you set GPU=Y.
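
As a quick sanity check (not specific to this project), you can verify from inside the container which GPU actually owns the render node before expecting QEMU to use it; renderD128 is just the usual first render node:

    # list the DRI nodes that made it into the container
    ls -l /dev/dri

    # PCI vendor of the render node: 0x8086 = Intel, 0x10de = Nvidia, 0x1002 = AMD
    cat /sys/class/drm/renderD128/device/vendor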

alpha754293 commented 5 months ago

@kroese The devices that are passed through for Nvidia GPUs are different:

/dev/nvidia0
/dev/nvidiactl 
/dev/nvidia-modeset 
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvram
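
If you wanted to hand those NVIDIA device nodes to the container the same way /dev/dri is handed over, the compose mapping would look roughly like this (a sketch only; whether QEMU inside the container can actually consume them is the open question):

    devices:
      - /dev/nvidia0
      - /dev/nvidiactl
      - /dev/nvidia-modeset
      - /dev/nvidia-uvm
      - /dev/nvidia-uvm-tools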

kroese commented 5 months ago

Okay, I didn't realize it was Intel-only. So then I assume that

/dev/dri/card0 is similar to /dev/nvidia0 with Nvidia.

In any case, the relevant code is in https://github.com/qemus/qemu-docker/blob/master/src/display.sh so if someone is able to modify it to support Nvidia too, that would be great.

alpha754293 commented 5 months ago

@kroese No worries.

Just sharing the very tiny bit of information that I know/that I can contribute.

(I'm not a developer, so a lot of this is wayyyy above what I can understand.)

For passing through an Intel iGPU (from an Intel N95 processor) to an unprivileged LXC container, I pass through:

/dev/dri
/dev/fb0

Husky110 commented 5 months ago

@alpha754293 - I was talking about passing through a "metal" GPU to the container and then passing the GPU inside the container to QEMU. I don't know how to do that with Proxmox, since my setup is an old laptop with a GTX 1080. And since I know that THAT part works, the rest should have to be done inside the container. That's where I'm coming from. :) Maybe that contributes as well.

alpha754293 commented 5 months ago

@Husky110 Pardon my less-than-intelligent question -- but how did you pass the GPU to a container and then pass said GPU to QEMU?

i.e. Did you have to blacklist the module/drivers from loading from within the container as well, in order for you to be able to then pass it on through to QEMU?

I've never tried passing through a GPU from a container to a VM.

Your help is greatly appreciated.

Husky110 commented 5 months ago

@alpha754293 - No worries. :D I was just talking about passing the GPU to the container. What I propose we should try is this: metal -> container -> QEMU inside the container.
AFAIK (and I am not an expert), the metal -> container part is pretty easy via the Compose file or docker run. I think the difficult part is to then pass the GPU from inside the container through to QEMU. That is why I mentioned Kasm Workspaces, since those images are already capable of doing the metal -> container part while providing a desktop environment. The hard part is figuring out how to route the GPU to QEMU. Using one of the Kasm images that already come prepacked with NVIDIA drivers (or something similar that makes nvidia-smi work), we could try to treat the container as if it were a regular bare-metal PC (install QEMU in there and try to pass the GPU through, as a proof of concept), which might ease things up. Then we "just" have to combine that effort with the images here and voila, we should have it working.
It's just a thought on how to accomplish the desired outcome. :)

alpha754293 commented 5 months ago

> I was just talking about passing the GPU to the container. What I propose we should try is this: metal -> container -> QEMU inside the container.

@Husky110 I already have this part working (again, via Proxmox), as I am able to pass the GPU from the host (metal) to the LXC container (where I am running my Docker instance).

> That is why I mentioned Kasm Workspaces, since those images are already capable of doing the metal -> container part while providing a desktop environment. […]

In terms of passing the GPU through from said Docker container to the Windows KVM VM, that's where I was trying to use Portainer to help me, as it helps with setting that up.

But I don't know if I would need to blacklist the module/drivers from the LXC container where I have Docker running, for that to work.

Normally, when I pass the GPU from host/metal to VM (QEMU/KVM) -- I would need to "tell" the VM to "take" the GPU. (You add a PCIe passthrough device to the VM configuration and also attach a hookscript which will "initialise" said GPU for the VM.)

The first piece of the puzzle, I already have up and running.

And I have deployment notes that is repeatable and stable.

I can play with trying to see if I can blacklist the Nvidia kernel modules/driver from the Ubuntu 20.04 LXC container that's running Docker, but I am still not quite certain what the Docker -> KVM equivalent is of "attaching" a GPU to the KVM VM, and then also telling said Ubuntu 20.04 LXC container or Docker to "hand off" the GPU to said VM.

(I also have deployment notes for how to do this with Proxmox and VMs in Proxmox that is also repeatable and stable.)

Husky110 commented 5 months ago

@alpha754293 - Try playing around with this... I don't know if QEMU is actually the way to go here or if another VM tool might do it better. That's the part I don't know anything about... :)

alpha754293 commented 5 months ago

@Husky110 Agreed.

I'm not sure either.

kieeps commented 5 months ago

Is it possible to connect to the QEMU instance with virt-manager?

Joly0 commented 5 months ago

> Is it possible to connect to the QEMU instance with virt-manager?

Nope, I tried that already; it doesn't work.

jasonmbrown commented 5 months ago

I am trying to get it working with vfio-pci passthrough, but the underlying container doesn't seem to have support for VFIO at this point... So I am trying to figure out how to get the drivers loaded. I got as far as configuring QEMU and the host, but I can't get the container's Debian base to recognize the device so it can be passed into the container...

I'll probably try again later, but does anyone have any information about what packages I need within Debian? (I'm thinking at this point I just get the VFIO drivers, load them into the container, then continue to pass through to QEMU.)
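
For reference, a rough sketch of the host-side steps usually involved before a GPU can reach QEMU via VFIO; the PCI address, vendor:device ID and IOMMU group number below are placeholders for your own card, and whether extra packages are needed inside the Debian base is exactly the open question here:

    # on the host: unbind the GPU from its current driver and hand it to vfio-pci
    echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
    echo "10de 1b80" > /sys/bus/pci/drivers/vfio-pci/new_id

    # then expose the VFIO device nodes to the container, e.g. in compose:
    # devices:
    #   - /dev/vfio/vfio
    #   - /dev/vfio/1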

alpha754293 commented 5 months ago

@jasonmbrown I would think that if you are using Debian (which is the Linux distro that runs underneath Proxmox), the instructions for passing through a GPU ought to be the same, no?

jasonmbrown commented 5 months ago

> @jasonmbrown I would think that if you are using Debian (which is the Linux distro that runs underneath Proxmox), the instructions for passing through a GPU ought to be the same, no?

Ya, but it's getting the drivers set up inside the container that I am unsure about. I'll probably work on getting it running again later though. (Learning lots about Linux now.)

Husky110 commented 5 months ago

Just another random input I just learned: playing inside the container might have some drawbacks itself... Easy Anti-Cheat seems to have "problems" with virtual machines; apparently Valorant bans people playing on Proxmox... I wonder about the remaining use cases then, since Proton is actually pretty far along already...