nestriness / nestri

[Experimental] An open-source, self-hosted Geforce Now alternative
https://nestri.io
GNU Affero General Public License v3.0

✨ feat(server): Add Intel/AMD GPU support #84

Closed DatCaptainHorse closed 2 months ago

DatCaptainHorse commented 3 months ago

Description

This is a DRAFT - Changes will be discussed and made upon request!

In a nutshell, this adds support for running Nestri with Intel and AMD GPUs, both integrated and dedicated.

It took a few days to find a trick for having output without dummy plugs or connected displays, but I think I got it.

gpu-screen-recorder requires a custom patch to skip the check for connected displays (as we're using an xrandr workaround which makes them stay "unconnected").

Most likely fixes #68

Changes

The NVIDIA sections have been split into their own code branches, since there are some NVIDIA-specific things I didn't feel it appropriate to poke at more than necessary for the goal of this PR.

Added a script with helper functions for GPU discovery and for gathering some basic info about the GPUs (note: it might be better to declare the helper script's arrays outside its initially run function). The helper scripts rely on lshw.

The NVIDIA code was slightly adjusted to use the bus IDs provided by the helper functions, for some code reuse.

Cleaned up a few things on the side.
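For readers unfamiliar with this kind of helper, here is a minimal sketch of lshw-based GPU discovery. It is not the PR's actual helper script: the function name `parse_lshw_display` and the output format are hypothetical, and it assumes lshw's default text output, where each display device lists a `vendor:` line before its `bus info:` line.

```shell
# Hypothetical sketch, not the PR's gpu_helpers code.
# Reads `lshw -c display` text output on stdin and prints one line per GPU
# in the form "<vendor> <bus info>", e.g. "intel pci@0000:00:02.0".
parse_lshw_display() {
  awk '
    /^ *vendor:/ {
      # Remember the (lowercased) vendor string of the current device.
      sub(/^ *vendor: */, "")
      vendor = tolower($0)
      next
    }
    /^ *bus info:/ {
      sub(/^ *bus info: */, "")
      # Map the free-form vendor string to a short vendor keyword.
      if (vendor ~ /nvidia/) v = "nvidia"
      else if (vendor ~ /intel/) v = "intel"
      else if (vendor ~ /amd|advanced micro devices/) v = "amd"
      else v = "unknown"
      print v, $0
      vendor = ""
    }
  '
}

# Typical use (requires lshw and usually root):
#   lshw -c display 2>/dev/null | parse_lshw_display
```

Parsing from stdin keeps the function easy to test with canned lshw output on machines that lack the GPUs in question.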

wanjohiryan commented 3 months ago

Hey @DatCaptainHorse sorry for the wait

This looks really nice. In fact, you have touched on some pain points I had not gotten around to ironing out.

However, what does glxinfo | grep vendor give you while using the Intel GPU?

DatCaptainHorse commented 3 months ago

Welcome back @wanjohiryan !

Here's the requested command output:

server glx vendor string: SGI
client glx vendor string: Mesa Project and SGI
OpenGL vendor string: Intel

Here's also glxinfo -B for good measure:

name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Intel (0x8086)
    Device: Mesa Intel(R) UHD Graphics P630 (CFL GT2) (0x3e96)
    Version: 24.0.9
    Accelerated: yes
    Video memory: 63911MB
    Unified memory: yes
    Preferred profile: core (0x1)
    Max core profile version: 4.6
    Max compat profile version: 4.6
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel
OpenGL renderer string: Mesa Intel(R) UHD Graphics P630 (CFL GT2)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 24.0.9-0ubuntu1
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6 (Compatibility Profile) Mesa 24.0.9-0ubuntu1
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 24.0.9-0ubuntu1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

wanjohiryan commented 3 months ago

Thank you 😊

Wow! So you figured out the Intel hardware acceleration. This is sooooo cool, I cannot express my excitement right now.

A few questions:

  1. The code you provided does not build on CI, could you fix that?
  2. Does gamescope, vkcube and/or glxgears run while inside the container?
  3. If I (more like the company) provided a remote G4ad AWS instance (which has an AMD GPU)... would you look into helping us support AMD GPUs? (I couldn't help but notice that you were requesting access to an AMD GPU to test this out)

Thanks @DatCaptainHorse you are a life saver.

DatCaptainHorse commented 3 months ago

Answers to your questions @wanjohiryan:

  1. It looks like it's the gpu-screen-recorder update (fixed by #83 - just merge that and I'll rebase this PR to make it pass!)
  2. Haven't tried gamescope (I'll do that later today), but vkcube, glxgears and various other games and applications run just fine, fully GPU accelerated 👍
  3. This PR should also work with AMD GPUs without modifications; #83 just needs to be merged first so I can remove the Intel-specific ffmpeg path and use gpu-screen-recorder for every vendor. I'd need a way to test that though, which is why I was requesting access to one.

You're welcome! I've got an additional Intel Arc A310 GPU hopefully arriving next week, so I can do further testing with same-vendor, multiple-GPU setups (and then it'll be my friend's remote-gaming GPU) 🙂

wanjohiryan commented 3 months ago

Hi @DatCaptainHorse

I just merged #83 so you have the green light to rebase this commit.

I have yet to test this PR on a non-Nvidia GPU... are there any "gotchas"/problems you have encountered?

DatCaptainHorse commented 3 months ago

@wanjohiryan Thank you!

If you have multiple GPUs in the test system, you need to pass the GPU_SELECTION=vendor:N variable to the container (or have it exported from somewhere) so that it finds and uses the proper GPU.

For example, in my server with an NVIDIA Quadro as the first GPU and an Intel iGPU as the second GPU (odd, but yeah):

If the given selection is invalid or can't be found, it will default to the first GPU found. Feel free to suggest changes to the GPU selection method if you wish 🙏

DatCaptainHorse commented 3 months ago

Pushed changes which fix and iron out the issues encountered. Made improvements to gpu_helpers, did some cleanup and improved error handling.

Tested and confirmed to work with an AMD GPU for the most part 🎉 Thanks to mimi07 for providing a VM to test with!

The index in GPU_SELECTION=vendor:N is now per-vendor. For example, if you have an NVIDIA GPU as the 1st GPU and an Intel iGPU as the 2nd GPU, you'd choose NVIDIA with GPU_SELECTION=nvidia:0 or Intel with GPU_SELECTION=intel:0 - I felt this would be easier to understand and use in certain situations.

To run a container on non-NVIDIA systems, --cap-add='SYS_ADMIN' is a required docker parameter so the container can access the GPU resources, along with --device=/dev/dri/ so the GPU(s) are visible to the container.
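The per-vendor indexing and the first-GPU fallback can be illustrated with a small sketch. This is not the PR's code: `select_gpu` is a hypothetical function, and it assumes the discovered GPUs are passed as a list of vendor names in discovery order.

```shell
# Hypothetical sketch, not the PR's implementation.
# select_gpu SELECTION VENDOR...
#   SELECTION is "vendor:N", where N counts only GPUs of that vendor.
#   VENDOR... lists the vendors of all discovered GPUs in discovery order.
# Prints the global (0-based) index of the chosen GPU; falls back to 0
# (the first GPU found) when the selection is invalid or has no match.
select_gpu() {
  local selection="$1" want_vendor="" want_index="" vendor global=0 seen=0
  shift
  [ "$#" -gt 0 ] || return 1   # no GPUs discovered at all

  case "$selection" in
    *:*) want_vendor="${selection%%:*}"; want_index="${selection#*:}" ;;
  esac
  # Treat a missing or non-numeric index as an invalid selection.
  case "$want_index" in
    ''|*[!0-9]*) want_vendor="" ;;
  esac

  for vendor in "$@"; do
    if [ -n "$want_vendor" ] && [ "$vendor" = "$want_vendor" ]; then
      if [ "$seen" -eq "$want_index" ]; then
        echo "$global"
        return 0
      fi
      seen=$((seen + 1))
    fi
    global=$((global + 1))
  done

  echo 0   # fallback: first GPU found
}

# Example: NVIDIA as 1st GPU, Intel iGPU as 2nd GPU
#   select_gpu intel:0 nvidia intel    # prints 1
#   select_gpu nvidia:0 nvidia intel   # prints 0
```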
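As a sketch of how those parameters fit together, the helper below only prints a `docker run` command line instead of executing it. The function name `build_run_cmd` and the image name `example/nestri-server` are placeholders, not names from the project, and any other options the container needs (ports, volumes and so on) are deliberately left out.

```shell
# Hypothetical dry-run helper: prints the docker command, does not run it.
# build_run_cmd IMAGE [GPU_SELECTION]
build_run_cmd() {
  local image="$1" selection="${2:-}"
  # SYS_ADMIN lets the container access the GPU resources;
  # /dev/dri makes the GPU device nodes visible inside it.
  set -- docker run --rm --cap-add=SYS_ADMIN --device=/dev/dri/
  if [ -n "$selection" ]; then
    # Only needed on multi-GPU hosts.
    set -- "$@" -e "GPU_SELECTION=$selection"
  fi
  printf '%s\n' "$* $image"
}

# Example (placeholder image name):
#   build_run_cmd example/nestri-server intel:0
```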

wanjohiryan commented 3 months ago

Thank you for everything. So, do we move this into a proper PR rather than a WIP?

Also, now that RADEON PRO has graphical issues, how do we handle that?

DatCaptainHorse commented 3 months ago

> Also, now that RADEON PRO has graphical issues, how do we handle that?

That was likely a permission issue (chmod 777 -R /dev/dri/ fixed it on a setup with the same errors). Note that self-hosting docs need to be created, and I can gladly contribute to writing those; it'd be a waste not to document the findings 🙂

I consider this PR ready for review now; it has been tested in various cases.

Non-Mixed GPU Tests

          1 GPU   2+ GPUs
NVIDIA    ✅      ❔
Intel     ✅      ✅
AMD       ✅      ❔

Mixed GPU Tests

                  Tested
NVIDIA + Intel    ✅
NVIDIA + AMD      ❔
AMD + Intel       ❔
AMD + NVIDIA      ❔

DatCaptainHorse commented 2 months ago

Improved output handling yet again after I found an annoyance with Docker: the container is able to see all outputs, even those from GPUs that are not passed to it, so the script would fail when trying to configure the iGPU's output while the selected GPU was the dedicated GPU 😅

Now the xrandr configuration is done in a loop over each output until one works. It's a bit brute-force, but it works better than failing immediately.

Edit: I also noticed that Xorg has a fakescreenfps option, which worked better than the AsyncFlipSecondaries option at making sure Xorg never "sleeps" and drops down to 1 FPS. I also made gpu-screen-recorder use the $REFRESH environment variable. I hope that's fine, as I don't see a reason to cap it at 60?
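The "try each output until one works" approach can be sketched roughly as follows. This is an illustration rather than the PR's script: both function names are hypothetical, and the actual xrandr workaround applied per output is not shown in this thread, so the configure step is passed in as a command.

```shell
# Hypothetical sketch of the retry loop, not the PR's actual script.
# configure_first_working_output CONFIGURE_CMD OUTPUT...
#   Runs CONFIGURE_CMD with each output name until one succeeds,
#   prints that output name and returns 0; returns 1 if none worked.
configure_first_working_output() {
  local configure="$1" output
  shift
  for output in "$@"; do
    if "$configure" "$output" 2>/dev/null; then
      echo "$output"
      return 0
    fi
  done
  return 1
}

# Lists every output xrandr reports, connected or not.
list_xrandr_outputs() {
  xrandr --query | awk '/connected/ { print $1 }'
}

# Example with an assumed configure step (the mode is a placeholder):
#   set_mode() { xrandr --output "$1" --mode 1920x1080; }
#   configure_first_working_output set_mode $(list_xrandr_outputs)
```

Keeping the configure step as a parameter means the loop itself can be exercised without an X server.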

wanjohiryan commented 2 months ago

Thank you @DatCaptainHorse