ublue-os / hwe

Fedora variants with support for ASUS devices, Nvidia devices, and Surface laptops
https://universal-blue.org/images/hwe
Apache License 2.0

Extra NVIDIA packages #35

Closed RedTopper closed 1 year ago

RedTopper commented 1 year ago

Hey! I've been using Kinoite with Nvidia for quite some time and love the idea of having a pre-packaged Nvidia variant available! I'm currently layering these two extra packages on my client that others may find useful:

nvidia-container-toolkit: Very useful for exposing GPU drivers to Podman. Officially, Fedora isn't supported by the package, but the CentOS8/RHEL package should work just fine (I'm using the centos8 package for my containers right now).

It does require some fiddling with SELinux labels, which this guide by Red Hat explains. I'm not sure whether any of that can be included OOTB with the OCI images, though.

nvidia-vaapi-driver (and libva-utils for vainfo): Could be useful for those wanting hardware-accelerated decoding in Firefox. Unfortunately it doesn't play very nicely with Flatpak Firefox quite yet, but I think there could still be merit in including it in this image.

Just some things to consider, I can still layer them just fine on my client.

xynydev commented 1 year ago

Someone on the thread you linked hacked together a solution for Flatpak FF & Nvidia VAAPI (https://github.com/elFarto/nvidia-vaapi-driver/issues/23#issuecomment-1369391278).

I agree with adding the container toolkit and vaapi driver to the builds by default, but we could also set up a justfile for setting the kargs and doing the nvidia/flatpak/ff setup, like the base image does.

RedTopper commented 1 year ago

Yeah, the workaround is what I've been using and I can confirm it works great! Though every time the vaapi driver updates you have to run the script. So it's just something that'll need a workaround or documentation for those who use flatpak Firefox.

Thanks for considering adding these packages!

castrojo commented 1 year ago

Since you need to reboot anyway would a systemd unit on boot be too horrible of a hack?

RedTopper commented 1 year ago

Nah, that sounds like a perfectly reasonable solution actually.
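
A minimal sketch of what such a boot-time unit could run, assuming the workaround amounts to copying the host's VA-API driver into a directory that Flatpak Firefox can read. The function name `sync_vaapi_driver` is hypothetical, and both paths are supplied by the caller rather than taken from the image:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the VA-API driver only when the destination
# is missing or differs from the source. Both paths come from the caller.
sync_vaapi_driver() {
    local src="$1" dest_dir="$2"
    local dest="${dest_dir}/$(basename "$src")"

    # Nothing to do if the host does not ship the driver.
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }

    # cmp -s exits 0 when both files are byte-identical.
    if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
        echo "up to date"
        return 0
    fi

    mkdir -p "$dest_dir" && cp -f "$src" "$dest" && echo "copied"
}
```

With the paths mentioned in this thread, the call would look like `sync_vaapi_driver /usr/lib64/dri/nvidia_drv_video.so "$HOME/.var/app/org.mozilla.firefox/dri"`. Comparing before copying keeps the unit a no-op on boots where the driver did not change.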

xynydev commented 1 year ago

Though every time the vaapi driver updates you have to run the script.

This could probably be fixed by running the command that copies the latest version into the correct directory inside the containerfile.

I've been getting into contributing to this repo, so I might look at it on the weekend. I'm wondering if there should be some sort of slim image that doesn't contain these packages.

RedTopper commented 1 year ago

This could probably be fixed by running the command that copies the latest version into the correct directory inside the containerfile.

It also needs to set the environment variables on the flatpak firefox. But if that can be done, then as long as it's copied to a location that flatpak doesn't blocklist and firefox is allowed to read from, it'll work.

I'm wondering if there should be some sort of slim image that doesn't contain these packages.

Here's the package sizes with dependencies retrieved using DNF if that helps you make a decision:

===========================================================================================================================================
 Package                                      Architecture          Version                       Repository                          Size
===========================================================================================================================================
Installing:
 nvidia-container-toolkit                     x86_64                1.12.0-1                      libnvidia-container                801 k
Installing dependencies:
 libnvidia-container-tools                    x86_64                1.12.0-1                      libnvidia-container                 54 k
 libnvidia-container1                         x86_64                1.12.0-1                      libnvidia-container                1.0 M
 libseccomp                                   x86_64                2.5.3-3.fc37                  fedora                              70 k
 nvidia-container-toolkit-base                x86_64                1.12.0-1                      libnvidia-container                2.9 M

Transaction Summary
===========================================================================================================================================
Install  5 Packages

Total download size: 4.8 M
Installed size: 14 M
Is this ok [y/N]: n
===========================================================================================================================================
 Package                             Architecture           Version                        Repository                                 Size
===========================================================================================================================================
Installing:
 libva-utils                         x86_64                 2.16.0-1.fc37                  updates                                   572 k
 nvidia-vaapi-driver                 x86_64                 0.0.8-1.fc37                   rpmfusion-nonfree-updates                  49 k

Transaction Summary
===========================================================================================================================================
Install  2 Packages

Total download size: 621 k
Installed size: 2.8 M
Is this ok [y/N]: n

joshua-stone commented 1 year ago

@RedTopper We may be able to add nvidia-container-runtime support. Can you test that nvidia-container-runtime support works?

$ rpm-ostree rebase ostree-unverified-registry:ghcr.io/ublue-os/silverblue-nvidia:pr-43

RedTopper commented 1 year ago

Sure!

rpm-ostree status
State: idle
Deployments:
● ostree-unverified-registry:ghcr.io/ublue-os/kinoite-nvidia:pr-43
                   Digest: sha256:a46f37b691deab4e8a2a3c3da39b7888a5826304f4daa95d3f21dd14b59ddb31
                  Version: 37.20230216.0 (2023-02-16T16:49:35Z)
      RemovedBasePackages: firefox firefox-langpacks 109.0.1-1.fc37
          LayeredPackages: arc-theme cockpit cockpit-podman cockpit-selinux docker-compose ffmpeg-free filelight libavcodec-freeworld
                           libi2c-devel libratbag-ratbagd libva-utils lm_sensors nvidia-vaapi-driver nvtop podman-docker qterminal
                           steam-devices xpadneo xrdp zsh
                Initramfs: --force-add tpm2-tss

And from the README update

podman run \
    --user 1000:1000 \
    --security-opt=no-new-privileges \
    --cap-drop=ALL \
    --security-opt label=type:nvidia_container_t  \
    docker.io/mirrorgooglecontainers/cuda-vector-add:v0.1
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

Though to be fair this isn't a clean install, so I already followed the guide and ran restorecon + the config.toml edits.

Also a side note: I was previously using rpmfusion and when rebasing to the new branch I got error: Packages not found: libavcodec-freeworld, nvidia-vaapi-driver, steam-devices. I was able to edit the /etc/yum.repos.d/rpmfusion-<repo>.repo files and enable them, but I was wondering if that's the expected UX for installing additional packages from rpmfusion.

ostree now shows the repo files as modified

sudo ostree admin config-diff
<snip>
M    kernel/cmdline
M    nvidia-container-runtime/config.toml
M    pki/akmods/certs
M    pki/akmods/private
<snip>
M    yum.repos.d/rpmfusion-nonfree-steam.repo
M    yum.repos.d/rpmfusion-free.repo
M    yum.repos.d/rpmfusion-nonfree-updates.repo
M    yum.repos.d/rpmfusion-nonfree.repo
...

joshua-stone commented 1 year ago

@RedTopper We disable rpmfusion and other repos at the end of the build to speed up metadata updates. Ideally any nvidia-specific package from rpmfusion should already be provided, and re-enabling them later if needed is better than having to add them back as layered packages. We may end up adding a just config for simplifying the enabling of these repos among other features.
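
Until such a just config exists, re-enabling a repo by hand could be sketched as below. This assumes the image disables repos with an `enabled=0` line inside each section of the `.repo` file. The function name `enable_repo_section` is hypothetical, and the section id must be checked against the actual file:

```shell
#!/usr/bin/env bash
# Hypothetical helper: set enabled=1 inside ONE section of a .repo file,
# leaving other sections (for example debuginfo or source) untouched.
enable_repo_section() {
    local repo_file="$1" section="$2" tmp
    [ -f "$repo_file" ] || { echo "missing: $repo_file" >&2; return 1; }
    tmp="$(mktemp)" || return 1

    # Track which [section] we are in; rewrite "enabled=" only in the wanted one.
    if awk -v want="[$section]" '
        /^\[/ { in_section = ($0 == want) }
        in_section && /^enabled=/ { print "enabled=1"; next }
        { print }
    ' "$repo_file" > "$tmp"; then
        # Write back through the existing file to keep its ownership and labels.
        cat "$tmp" > "$repo_file"
    fi
    rm -f "$tmp"
}
```

Run from a root shell, a call might look like `enable_repo_section /etc/yum.repos.d/rpmfusion-free.repo rpmfusion-free`, where the section id is an assumption. The file will then show up as modified in `ostree admin config-diff`, as in the output above.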

joshua-stone commented 1 year ago

@RedTopper Changes have been deployed for rootless support. Can you test against the latest version of ostree-unverified-registry:ghcr.io/ublue-os/silverblue-nvidia:pr-43 ?

RedTopper commented 1 year ago

We disable rpmfusion and other repos at the end of the build to speed up metadata updates.

Ah, understandable. I didn't see any mention of that anywhere so I didn't know if that was the intended way to go about that.

Changes have been deployed for rootless support.

I did an update and I fell back to nouveau, so something must be up. Rolling back restores the NVIDIA driver. Did the driver get signed properly?

joshua-stone commented 1 year ago

@RedTopper Did you enable secure boot support? Every PR build generates a temporary key to ensure it can build even when outside contributors are submitting changes.

RedTopper commented 1 year ago

Oh, that would make sense. I can install the new key and try again (I expected it to be the same)

joshua-stone commented 1 year ago

There are more changes being pushed for this issue, so if you update against :pr-43 you'd have to enroll the key again or disable secure boot.

RedTopper commented 1 year ago

Looking good. I rolled back my changes to config.toml by copying the original from /usr/etc and podman could use the GPU in rootless.
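
For anyone else who edited files under /etc before switching, the rollback described here can be sketched as follows. The function name `restore_default_config` is hypothetical; the two root directories default to /usr/etc and /etc but are parameters, so the helper can be tried on scratch directories first:

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace a locally modified file under /etc with the
# image default kept under /usr/etc, keeping a backup of the modified copy.
restore_default_config() {
    local rel_path="$1" default_root="${2:-/usr/etc}" live_root="${3:-/etc}"
    local src="${default_root}/${rel_path}" dest="${live_root}/${rel_path}"

    [ -f "$src" ] || { echo "no default for: $rel_path" >&2; return 1; }

    # Save the modified file next to the original before overwriting it.
    if [ -f "$dest" ]; then cp -p "$dest" "${dest}.bak"; fi
    mkdir -p "$(dirname "$dest")"
    cp -f "$src" "$dest"
}
```

From a root shell, `restore_default_config nvidia-container-runtime/config.toml` would restore the file that `ostree admin config-diff` listed as modified earlier in the thread.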

I already installed the selinux module prior to switching, but I can confirm that it's still being loaded.

sudo semodule -l | grep nvidia
nvidia-container

and my ostree admin config-diff:

sudo ostree admin config-diff | grep nvidia 
M    selinux/targeted/active/modules/400/nvidia-container/cil
M    selinux/targeted/active/modules/400/nvidia-container/hll
D    systemd/system/multi-user.target.wants/nvidia-powerd.service
A    yum.repos.d/nvidia-docker.repo

Not sure if I should revert the selinux/ files or not.

(I'll delete the .repo when this gets merged)

I can give the "it works on my machine" seal at least. Let me know if you need more details/testing.

joshua-stone commented 1 year ago

That's promising to hear so far! Can you confirm that nothing concerning shows up in logs during the testing? Something like this should show no obvious permission errors:

$ journalctl --boot | grep -i nvidia
$ sudo ausearch -i --start recent | grep -i nvidia

joshua-stone commented 1 year ago

As for ostree admin, it looks like the policies are already in place because this returns nothing when I run the image in a VM:

$ sudo ostree admin config-diff | grep nvidia

RedTopper commented 1 year ago

Ok, since you have no changes to the selinux files I'll copy them off somewhere and roll mine back. I want to make sure my configuration is as close as reasonably possible to "default" for this.

After rolling those files back, journalctl doesn't produce anything interesting, ausearch returns no output, and my containers still have access to the GPU.

joshua-stone commented 1 year ago

@RedTopper Changes have been merged upstream! That should leave testing vaapi support:

https://github.com/ublue-os/nvidia/pull/44

$ rpm-ostree rebase ostree-unverified-registry:ghcr.io/ublue-os/silverblue-nvidia:pr-44

RedTopper commented 1 year ago

Got around to testing:

Rebased and uninstalled my currently layered packages

❯ rpm-ostree rebase ostree-unverified-registry:ghcr.io/ublue-os/kinoite-nvidia:pr-44 --uninstall ffmpeg-free --uninstall nvidia-vaapi-driver --uninstall libavcodec-freeworld --uninstall libva-utils

reboot

❯ rpm-ostree status
State: idle
Deployments:
● ostree-unverified-registry:ghcr.io/ublue-os/kinoite-nvidia:pr-44
                   Digest: sha256:dfb0e203ca6f8581b0e1979897452ef3a880faebff4941fcef2e5239e0b6f170
                  Version: 37.20230216.0 (2023-02-17T15:31:04Z)
      RemovedBasePackages: firefox firefox-langpacks 109.0.1-1.fc37
          LayeredPackages: arc-theme cockpit cockpit-podman cockpit-selinux docker-compose filelight libi2c-devel libratbag-ratbagd
                           lm_sensors nvtop podman-docker qterminal steam-devices xpadneo xrdp zsh
                Initramfs: --force-add tpm2-tss 
❯ vainfo
Trying display: wayland
Trying display: x11
libva info: VA-API version 1.16.0
libva info: Trying to open /usr/lib64/dri/nvidia_drv_video.so
libva info: Found init function __vaDriverInit_1_0
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.16 (libva 2.16.0)
vainfo: Driver version: VA-API NVDEC driver [egl backend]
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      <unknown profile>               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileAV1Profile0            : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain12             : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD
❯ ffmpeg -codecs 2> /dev/null | grep -e nvenc -e vaapi
 DEV.LS h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_v4l2m2m h264_qsv h264_cuvid ) (encoders: libx264 libx264rgb h264_amf h264_nvenc h264_qsv h264_v4l2m2m h264_vaapi )
 DEV.L. hevc                 H.265 / HEVC (High Efficiency Video Coding) (decoders: hevc hevc_qsv hevc_v4l2m2m hevc_cuvid ) (encoders: libx265 hevc_amf hevc_nvenc hevc_qsv hevc_v4l2m2m hevc_vaapi )
 DEVIL. mjpeg                Motion JPEG (decoders: mjpeg mjpeg_cuvid mjpeg_qsv ) (encoders: mjpeg mjpeg_qsv mjpeg_vaapi )
 DEV.L. mpeg2video           MPEG-2 video (decoders: mpeg2video mpegvideo mpeg2_v4l2m2m mpeg2_qsv mpeg2_cuvid ) (encoders: mpeg2video mpeg2_qsv mpeg2_vaapi )
 DEV.L. vp8                  On2 VP8 (decoders: vp8 vp8_v4l2m2m libvpx vp8_cuvid vp8_qsv ) (encoders: libvpx vp8_v4l2m2m vp8_vaapi )
 DEV.L. vp9                  Google VP9 (decoders: vp9 vp9_v4l2m2m libvpx-vp9 vp9_cuvid vp9_qsv ) (encoders: libvpx-vp9 vp9_vaapi vp9_qsv )

Then I copied the new driver to the flatpak location

 ❯ cp /usr/lib64/dri/nvidia_drv_video.so ~/.var/app/org.mozilla.firefox/dri 

and I can still see Firefox using compute when a video is played back. So it looks good to me.


joshua-stone commented 1 year ago

@RedTopper We may end up adding a just setup script for Firefox flatpak in a later change. How does this look?

flatpak override \
    --user \
    --filesystem=host-os \
    --env=LIBVA_DRIVER_NAME=nvidia \
    --env=LIBVA_DRIVERS_PATH=/run/host/usr/lib64/dri \
    --env=LIBVA_MESSAGING_LEVEL=1 \
    --env=MOZ_DISABLE_RDD_SANDBOX=1 \
    --env=NVD_BACKEND=direct \
    org.mozilla.firefox
for PROFILE in ~/.var/app/org.mozilla.firefox/.mozilla/firefox/*.default-release-*/prefs.js; do
    grep -q 'user_pref("gfx.webrender.all", true);' "$PROFILE" || echo 'user_pref("gfx.webrender.all", true);' >> "$PROFILE"
    grep -q 'user_pref("media.ffmpeg.vaapi.enabled", true);' "$PROFILE" || echo 'user_pref("media.ffmpeg.vaapi.enabled", true);' >> "$PROFILE"
done

RedTopper commented 1 year ago

I haven't heard of just before, but it seems reasonable to use that. On the script:

--filesystem=host-os: I feel like it'd be better to copy the driver in the Containerfile to a location that Flatpak doesn't blocklist and then grant Firefox that path specifically, rather than giving Firefox access to all host libraries. But if there isn't a good spot, this solution is fine by me, since enabling it in Flatseal doesn't seem to do anything crazy.

for PROFILE in ~/.var/app/org.mozilla.firefox/.mozilla/firefox/*.default-release-*/prefs.js; do: My firefox profile is zi8e4zmi.default-release, so I had to modify this line to be .default-release* with no -
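
A sketch of the adjusted loop, assuming profile directories may or may not carry a suffix after `default-release`. The function names `ensure_pref` and `enable_vaapi_prefs` are hypothetical, and the Firefox directory is passed in rather than hard-coded:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a pref line to prefs.js only when that exact
# line is not already present, so repeated runs do not duplicate it.
ensure_pref() {
    local prefs_file="$1" pref_line="$2"
    # -q quiet, -x whole-line match, -F fixed string (no regex).
    grep -qxF "$pref_line" "$prefs_file" || printf '%s\n' "$pref_line" >> "$prefs_file"
}

# Apply the two prefs from the proposed script to every matching profile.
# The glob has no trailing "-", so it matches both "xxxx.default-release"
# and "xxxx.default-release-1".
enable_vaapi_prefs() {
    local firefox_dir="$1" prefs
    for prefs in "$firefox_dir"/*.default-release*/prefs.js; do
        [ -f "$prefs" ] || continue   # glob matched nothing
        ensure_pref "$prefs" 'user_pref("gfx.webrender.all", true);'
        ensure_pref "$prefs" 'user_pref("media.ffmpeg.vaapi.enabled", true);'
    done
}
```

For the Flatpak install this would be called as `enable_vaapi_prefs "$HOME/.var/app/org.mozilla.firefox/.mozilla/firefox"`. Matching the whole line with `grep -qxF` avoids treating the parentheses and dots in the pref as regex syntax.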

joshua-stone commented 1 year ago

I'm not sure there is an ideal solution for Firefox flatpak because ostree container commit requires many directories outside of /usr to be empty:

https://github.com/ublue-os/nvidia/blob/303bf28d71220264d979f01f7311c0abc7e9a0cc/Containerfile#L107

Setting --filesystem=/usr/lib64/dri would have been preferable, but that doesn't appear to work. I also don't think copying nvidia_drv_video.so into the home directory would work well if users rebased onto a different Nvidia driver branch.

That leaves --filesystem=host-os, which is far from ideal, but it works.

joshua-stone commented 1 year ago

https://github.com/ublue-os/nvidia/pull/44 has been merged, so we can close this.

RedTopper commented 1 year ago

Hmm, that's true. You bring up a good point about rebasing. Either a systemd unit or having the just script copy it to a user-mutable location would leave a driver file dangling if the user rebases onto a different branch or release...

I also can't think of another solution. Firefox shouldn't pull any libs from /run/host/usr by default anyway (unless we tell it to, like we are now), and since firefox isn't running as root (hopefully), that directory is effectively read-only.

So after thinking about it from multiple different angles, I'm inclined to agree.

joshua-stone commented 1 year ago

The intel-vaapi-driver package appears to have been packaged as org.freedesktop.Platform.VAAPI.Intel. I believe packaging nvidia-vaapi-driver as org.freedesktop.Platform.VAAPI.Nvidia would probably be the best solution so that Firefox flatpak no longer needs to depend on the host to provide hardware acceleration libraries.

RedTopper commented 1 year ago

Oh yeah, that's the optimal solution. It looks like the original issue I mentioned had a link to an issue on gitlab for doing just that. And in theory flatpak would install it automatically when it finds the one on the host.

In other news I pulled ublue-os/kinoite-nvidia:latest and everything is working, including the suggested host-os permission for firefox! Thanks a ton you guys, I now don't have to layer the following packages, which has significantly reduced the headache of using kinoite:

LayeredPackages:
akmod-nvidia
ffmpeg-free
libavcodec-freeworld
libva-utils 
nvidia-container-toolkit
nvidia-vaapi-driver
nvtop
rpmdevtools
rpmfusion-free-release
rpmfusion-nonfree-release
xorg-x11-drv-nvidia
xorg-x11-drv-nvidia-cuda

LocalPackages: 
akmods-keys-0.0.2-8.fc37.noarch

Thanks!