zelikos / davincibox

Container for DaVinci Resolve installation and runtime dependencies on Linux
Apache License 2.0

`setup.sh` GPU check does not account for VFIO #113

Closed by noxifoxi 2 months ago

noxifoxi commented 2 months ago

Describe the bug

The script recognises my AMD GPU as an NVIDIA GPU, which I assume is what is causing the problem.

❯ ./setup.sh
Distrobox found.
Nvidia GPU detected.
...

When launching Resolve normally, I get the error described in #74.

I had to change the .desktop entry to

distrobox-enter -n davincibox -- /usr/bin/run-davinci rusticl %u

to force rusticl and get it working.

To Reproduce

  1. run the script normally
  2. click through the installer
  3. try launching Resolve

Expected behavior

The script should detect the GPU correctly and set rusticl as the default for AMD GPUs so there is no tinkering required afterwards.

zelikos commented 2 months ago

Strange. It's working as expected for me, on an RX 6600 XT.

Can you show me the output of `lshw -c video 2>/dev/null | grep -i nvidia` on both the host and in the container?

noxifoxi commented 2 months ago

I think the script has trouble excluding inactive GPUs. I have an NVIDIA GPU installed for VM passthrough. The host OS does not use it at all, as indicated by driver=vfio-pci under configuration.

lshw -c video

  *-display                 
       description: VGA compatible controller
       product: Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: /dev/fb0
       version: c1
       width: 64 bits
       clock: 33MHz
       capabilities: vga_controller bus_master cap_list rom fb
       configuration: depth=32 driver=amdgpu latency=0 resolution=2560,1440
       resources: iomemory:f80-f7f iomemory:fc0-fbf irq:142 memory:f800000000-fbffffffff memory:fc00000000-fc0fffffff ioport:f000(size=256) memory:f5f00000-f5ffffff memory:f6000000-f601ffff
  *-display
       description: VGA compatible controller
       product: GP106 [GeForce GTX 1060 6GB]
       vendor: NVIDIA Corporation
       physical id: 0
       bus info: pci@0000:0e:00.0
       version: a1
       width: 64 bits
       clock: 33MHz
       capabilities: vga_controller cap_list rom
       configuration: driver=vfio-pci latency=0
       resources: iomemory:fc0-fbf iomemory:fc0-fbf irq:255 memory:f4000000-f4ffffff memory:fc20000000-fc2fffffff memory:fc30000000-fc31ffffff ioport:d000(size=128) memory:f5000000-f507ffff
  *-display
       description: VGA compatible controller
       product: Raphael
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:18:00.0
       version: c2
       width: 64 bits
       clock: 33MHz
       capabilities: vga_controller bus_master cap_list
       configuration: driver=amdgpu latency=0
       resources: iomemory:fc0-fbf iomemory:fc0-fbf irq:106 memory:fc40000000-fc4fffffff memory:fc50000000-fc501fffff ioport:e000(size=256) memory:f5e00000-f5e7ffff

lshw -c video 2>/dev/null | grep -i nvidia for good measure:

Host:

       vendor: NVIDIA Corporation

Container:

       vendor: NVIDIA Corporation

zelikos commented 2 months ago

The script basically just uses the command I asked you to run for its GPU detection, so more niche situations like secondary GPUs for VM passthrough aren't currently accounted for.

Specifically, setup.sh checks whether an Nvidia GPU is present, on the assumption that an Nvidia GPU, if installed, is usually the primary GPU, and applies Nvidia-specific options if so. The container also runs a GPU check to apply the relevant workarounds, and it checks for Nvidia first for the same reason.
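A minimal sketch of why a vendor-name check trips on this setup, assuming detection amounts to the `grep -i nvidia` pipeline quoted earlier in the thread. The function name `vendor_check` and the trimmed sample text are hypothetical, and this is not the actual code from setup.sh:

```shell
#!/bin/sh
# Hypothetical stand-in for a vendor-name check: succeeds if the text on
# stdin mentions NVIDIA anywhere, regardless of which driver is bound.
vendor_check() {
    grep -qi nvidia
}

# Trimmed from the lshw output posted in this thread:
# the GTX 1060 is bound to vfio-pci, not to an Nvidia driver.
sample='  *-display
       product: GP106 [GeForce GTX 1060 6GB]
       vendor: NVIDIA Corporation
       configuration: driver=vfio-pci latency=0'

if printf '%s\n' "$sample" | vendor_check; then
    echo "Nvidia GPU detected."
fi
```

The check succeeds on the vendor line alone, so a card reserved for passthrough is treated the same as one the host is actually using.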

tl;dr: Whatever fix is done for this will need to go into both setup.sh and /etc/profile.d/davinci.sh, e.g. checking for driver=<driver-name> rather than checking the vendor name like we do now.
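One possible shape for a driver-based check, as a rough sketch rather than the project's actual fix. The function name `nvidia_gpu_in_use` is hypothetical. It assumes the proprietary driver shows up as `driver=nvidia` in lshw, and that the vendor line precedes the configuration line within each `*-display` block, as in the output posted above:

```shell
#!/bin/sh
# Reads `lshw -c video` output on stdin. Exits 0 only if some display block
# has an NVIDIA vendor line AND is bound to the wanted driver.
# Assumption: an in-use Nvidia card reports driver=nvidia (the default here).
nvidia_gpu_in_use() {
    awk -v want="driver=${1:-nvidia}" '
        # A line starting with "*-" opens a new device block: reset state.
        /^[ \t]*\*-/ { is_nvidia = 0 }
        # Remember that the current block belongs to an NVIDIA device.
        tolower($0) ~ /vendor:.*nvidia/ { is_nvidia = 1 }
        # Look for the wanted driver=... token in the configuration line.
        /configuration:/ && is_nvidia {
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit !found }
    '
}
```

It could be called as `lshw -c video 2>/dev/null | nvidia_gpu_in_use`. With the output from this issue, the GTX 1060 block carries driver=vfio-pci, so the function returns non-zero and the Nvidia-specific path would be skipped.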

In the meantime, your workaround of using run-davinci should suffice; if you need to run any of the other programs included with Resolve, you can also edit /etc/profile.d/davinci.sh in your container and remove the Nvidia GPU check for now.