netbrain / zwift

Easily zwift on linux
The Unlicense

[HELP (Not Launching/ Crashing)] Looking for an nvidia library that doesn't exist #163

Closed msmurphy closed 2 hours ago

msmurphy commented 21 hours ago

Describe the issue

The container launch looks for the wrong NVIDIA library. I recently updated the NVIDIA driver from 555 to 560, and now launching via either podman or docker fails.

The output shows it is looking for libEGL_nvidia.so.550.54.14, which no longer exists on my machine.

mike@pop-os:~$ DEBUG=1 zwift
+ [[ -f /home/mike/.config/zwift/config ]]
+ ZWIFT_CONFIG_FLAG='--env-file /home/mike/.config/zwift/config'
+ source /home/mike/.config/zwift/config
+ [[ -f /home/mike/.config/zwift/mike-config ]]
+ [[ ! -z '' ]]
+ WINDOW_MANAGER=Other
+ IMAGE=docker.io/netbrain/zwift
+ VERSION=latest
+ NETWORKING=bridge
++ id -u
+ ZWIFT_UID=1000
++ id -g
+ ZWIFT_GID=1000
+ '[' '!' ']'
++ command -v podman
+ '[' -x /bin/podman ']'
+ CONTAINER_TOOL=podman
+ '[' podman == podman ']'
+ LOCAL_UID=1000
+ LOCAL_GID=1000
+ CONTAINER_UID=1000
+ CONTAINER_GID=1000
+ case "$XDG_SESSION_TYPE" in
+ WINDOW_MANAGER=XOrg
+ '[' XOrg = Wayland ']'
+ [[ ! -n '' ]]
++ curl -s https://raw.githubusercontent.com/netbrain/zwift/master/zwift.sh
++ sha256sum
++ awk '{print $1}'
+ REMOTE_SUM=732988328e5b86a1174a828a9915f225b024b235d625867a8d2fc092dc09b76f
++ sha256sum /usr/local/bin/zwift
++ awk '{print $1}'
+ THIS_SUM=732988328e5b86a1174a828a9915f225b024b235d625867a8d2fc092dc09b76f
+ '[' 732988328e5b86a1174a828a9915f225b024b235d625867a8d2fc092dc09b76f = 732988328e5b86a1174a828a9915f225b024b235d625867a8d2fc092dc09b76f ']'
+ echo 'You are running latest zwift.sh 👏'
You are running latest zwift.sh 👏
+ [[ ! -n '' ]]
+ podman pull docker.io/netbrain/zwift:latest
Trying to pull docker.io/netbrain/zwift:latest...
Getting image source signatures
Copying blob 91d3c72062c8 skipped: already exists
Copying blob 961770e3c8f4 skipped: already exists
Copying blob b1e7716bc2ae skipped: already exists
Copying blob f4cc7fab2776 skipped: already exists
Copying blob 891fc5a514d6 skipped: already exists
Copying blob 5a3f8924da7b skipped: already exists
Copying blob ec2fc8bb93af skipped: already exists
Copying blob a77fe582be47 skipped: already exists
Copying blob bb3aa9ee0979 skipped: already exists
Copying blob 65c4ff73c14f skipped: already exists
Copying blob de36efc410c7 skipped: already exists
Copying blob 27c225951703 skipped: already exists
Copying blob b060632551dc skipped: already exists
Copying blob 49bc3a60e777 skipped: already exists
Copying blob 4f4fb700ef54 skipped: already exists
Copying blob 1f28b6854218 skipped: already exists
Copying blob 4f4fb700ef54 skipped: already exists
Copying blob 4f4fb700ef54 skipped: already exists
Copying blob 30cf83acf330 skipped: already exists
Copying blob 4f4fb700ef54 skipped: already exists
Copying blob 8360dfe43bf3 skipped: already exists
Copying blob 5412401d128e skipped: already exists
Copying blob 4f4fb700ef54 skipped: already exists
Copying blob da24a42f31a7 skipped: already exists
Copying blob a7e5916ecbd2 skipped: already exists
Copying blob bb547e1678b9 skipped: already exists
Copying blob 1647aba5e82a skipped: already exists
Copying blob 6216c745319d skipped: already exists
Copying blob fce0e32f5882 skipped: already exists
Copying blob e070778738a5 skipped: already exists
Copying blob 91503bf155c1 skipped: already exists
Copying config 82355aea89 done   |
Writing manifest to image destination
82355aea894d69a6150356e550b2410fb07f06e07bd72e2ab15fc83608befc4b
+ GENERAL_FLAGS=(-d --rm --privileged --network $NETWORKING --name zwift-$USER --security-opt label=disable --hostname $HOSTNAME -e DISPLAY=$DISPLAY -e ZWIFT_UID=$CONTAINER_UID -e ZWIFT_GID=$CONTAINER_GID -e PULSE_SERVER=/run/user/$CONTAINER_UID/pulse/native -v zwift-$USER:/home/user/.wine/drive_c/users/user/Documents/Zwift -v /run/user/$LOCAL_UID/pulse:/run/user/$CONTAINER_UID/pulse)
+ [[ -f /proc/driver/nvidia/version ]]
+ [[ podman == \p\o\d\m\a\n ]]
+ VGA_DEVICE_FLAG=--device=nvidia.com/gpu=all
+ [[ -n unix:path=/run/user/1000/bus ]]
+ [[ unix:path=/run/user/1000/bus =~ ^unix:path=([^,]+) ]]
+ DBUS_UNIX_SOCKET=/run/user/1000/bus
+ [[ -n /run/user/1000/bus ]]
+ DBUS_CONFIG_FLAGS=(-e DBUS_SESSION_BUS_ADDRESS=$(echo $DBUS_SESSION_BUS_ADDRESS | sed 's/'$LOCAL_UID'/'$CONTAINER_UID'/') -v $DBUS_UNIX_SOCKET:$(echo $DBUS_UNIX_SOCKET | sed 's/'$LOCAL_UID'/'$CONTAINER_UID'/'))
++ echo unix:path=/run/user/1000/bus
++ sed s/1000/1000/
++ echo /run/user/1000/bus
++ sed s/1000/1000/
+ '[' XOrg == Wayland ']'
+ '[' XOrg == XWayland ']'
+ '[' XOrg == XOrg ']'
+ '[' -z /run/user/1000/gdm/Xauthority ']'
+ WM_FLAGS=(-e XAUTHORITY=$(echo $XAUTHORITY | sed 's/'$LOCAL_UID'/'$CONTAINER_UID'/') -v /tmp/.X11-unix:/tmp/.X11-unix -v $XAUTHORITY:$(echo $XAUTHORITY | sed 's/'$LOCAL_UID'/'$CONTAINER_UID'/'))
++ echo /run/user/1000/gdm/Xauthority
++ sed s/1000/1000/
++ echo /run/user/1000/gdm/Xauthority
++ sed s/1000/1000/
+ '[' XOrg == XOrg ']'
+ unset WINE_EXPERIMENTAL_WAYLAND
+ '[' podman == podman ']'
++ podman volume ls
++ grep zwift-mike
+ [[ -z local       zwift-mike ]]
+ PODMAN_FLAGS=(--userns keep-id:uid=$CONTAINER_UID,gid=$CONTAINER_GID)
++ podman run -d --rm --privileged --network bridge --name zwift-mike --security-opt label=disable --hostname pop-os -e DISPLAY=:1 -e ZWIFT_UID=1000 -e ZWIFT_GID=1000 -e PULSE_SERVER=/run/user/1000/pulse/native -v zwift-mike:/home/user/.wine/drive_c/users/user/Documents/Zwift -v /run/user/1000/pulse:/run/user/1000/pulse --env-file /home/mike/.config/zwift/config --device=nvidia.com/gpu=all -e DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus -v /run/user/1000/bus:/run/user/1000/bus -e XAUTHORITY=/run/user/1000/gdm/Xauthority -v /tmp/.X11-unix:/tmp/.X11-unix -v /run/user/1000/gdm/Xauthority:/run/user/1000/gdm/Xauthority --userns keep-id:uid=1000,gid=1000 docker.io/netbrain/zwift:latest
Error: crun: error stat'ing file `/lib/i386-linux-gnu/libEGL_nvidia.so.550.54.14`: No such file or directory: OCI runtime attempted to invoke a command that was not found
+ CONTAINER=
+ '[' 127 -ne 0 ']'
+ msgbox error 'Error can'\''t run zwift, check variables!' 10
+ TYPE=error
+ MSG='Error can'\''t run zwift, check variables!'
+ TIMEOUT=10
+ RED='\033[0;31m'
+ NC='\033[0m'
+ BOLD='\033[1m'
+ UNDERLINE='\033[4m'
+ case $1 in
+ echo -e '\033[0;31m\033[1m\033[4mError - Error can'\''t run zwift, check variables!\033[0m'
Error - Error can't run zwift, check variables!
+ '[' 10 -eq 0 ']'
+ sleep 10
+ exit 0

Distribution Details

OS: Pop!_OS 22.04 LTS x86_64
Kernel: 6.9.3-76060903-generic
Shell: bash 5.1.16
Resolution: 3440x1440, 2560x1440
DE: GNOME 42.9
WM: Mutter
Terminal: alacritty
CPU: AMD Ryzen 9 9950X (32) @ 5.752GHz
GPU: AMD ATI 17:00.0 Device 13c0
GPU: NVIDIA GeForce RTX 2070 SUPER
Memory: 3797MiB / 31105MiB

Reproduction steps

  1. Install nvidia driver 560 on top of an existing 555 install
  2. Download the latest zwift.sh following the instructions on GitHub
  3. Run zwift ...

netbrain commented 6 hours ago

If I were to guess, I'd say there's probably an issue with your nvidia-container-toolkit.

Other than that maybe the container is on a too old version of something...

Try debugging the container toolkit first by making sure you can run gpu related tasks in a container

https://training.tensorworks.com.au/cn/cn001/activities/05-running-gpu-accelerated-linux-containers-with-the-nvidia-container-toolkit#running-containers-using-the-nvidia-container-toolkit
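
A minimal sketch of such a check, assuming podman and reusing the `--device nvidia.com/gpu=all` and `--security-opt label=disable` flags that zwift.sh passes in the log above. The function name `gpu_smoke_cmd`, the `RUN` variable, and the default image are hypothetical choices for illustration. It also assumes the CDI spec makes `nvidia-smi` available inside the container.

```shell
#!/usr/bin/env bash
# Sketch: build a one-off GPU smoke test for podman.
# By default the command is only printed; set RUN=1 to execute it.
gpu_smoke_cmd() {
  local image="${1:-docker.io/library/ubuntu}"   # assumed image, any small one should do
  local cmd=(podman run --rm --security-opt label=disable
             --device nvidia.com/gpu=all "$image" nvidia-smi -L)
  if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"          # lists the GPUs visible inside the container
  else
    echo "${cmd[*]}"     # dry run: show what would be executed
  fi
}
```

If `RUN=1 gpu_smoke_cmd` fails with the same "No such file or directory" error as zwift, the problem is likely in the toolkit setup rather than in the zwift image.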

msmurphy commented 6 hours ago

Yeah, so it turns out that every time you upgrade your drivers you need to regenerate the CDI spec file with the nvidia toolkit. The issue I'm having is that the toolkit segfaults when I try to generate the file with the command below. Since I'm on a Pop!_OS-specific version of the toolkit, my guess is that the toolkit is out of date for the newest nvidia drivers.

mike@pop-os:~$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
INFO[0000] Selecting /dev/nvidia0 as /dev/nvidia0
INFO[0000] Selecting /dev/dri/card2 as /dev/dri/card2
WARN[0000] Could not locate /dev/dri/controlD66: pattern /dev/dri/controlD66 not found
INFO[0000] Selecting /dev/dri/renderD129 as /dev/dri/renderD129
INFO[0000] Selecting /var/run/nvidia-persistenced/socket as /var/run/nvidia-persistenced/socket
WARN[0000] Could not locate /var/run/nvidia-fabricmanager/socket: pattern /var/run/nvidia-fabricmanager/socket not found
WARN[0000] Could not locate /tmp/nvidia-mps: pattern /tmp/nvidia-mps not found
INFO[0000] Using driver version 560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvoptix.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-wayland-client.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-tls.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-rtcore.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-opencl.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-nvvm.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-ngx.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-ml.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-gpucomp.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-glsi.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-glcore.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-fbc.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-encode.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-eglcore.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-cfg.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvidia-allocator.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libnvcuvid.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libcudadebugger.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libcuda.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libGLX_nvidia.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libGLESv2_nvidia.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.560.35.03
INFO[0000] found 64-bit driver lib: /lib/x86_64-linux-gnu/libEGL_nvidia.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-tls.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-opticalflow.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-opencl.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-nvvm.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-ml.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-gpucomp.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-glvkspirv.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-glsi.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-glcore.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-fbc.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-encode.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvidia-eglcore.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libnvcuvid.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libcuda.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libGLX_nvidia.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libGLESv2_nvidia.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.560.35.03
INFO[0000] found 32-bit driver lib: /lib/i386-linux-gnu/libEGL_nvidia.so.560.35.03
INFO[0000] Selecting /dev/nvidia-modeset as /dev/nvidia-modeset
INFO[0000] Selecting /dev/nvidia-uvm-tools as /dev/nvidia-uvm-tools
INFO[0000] Selecting /dev/nvidia-uvm as /dev/nvidia-uvm
INFO[0000] Selecting /dev/nvidiactl as /dev/nvidiactl
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x2c pc=0x512c22]

goroutine 1 [running]:
github.com/sirupsen/logrus.(*Logger).Logf(0xc0003fd990?, 0x1?, {0x6afa84?, 0x1?}, {0xc00019a868?, 0x203000?, 0x203000?})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/sirupsen/logrus@v1.9.0/logger.go:152 +0x22
github.com/sirupsen/logrus.(*Logger).Warnf(...)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/sirupsen/logrus@v1.9.0/logger.go:178
github.com/NVIDIA/nvidia-container-toolkit/internal/lookup.library.Locate({0x0, {0x6fd0c8, 0xc00012a190}, {0x6fd670, 0xc000144000}}, {0x6a857d, 0x14})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/lookup/library.go:60 +0x1ab
github.com/NVIDIA/nvidia-container-toolkit/internal/discover.(*mounts).Mounts(0xc00011c360)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/discover/mounts.go:71 +0x52b
github.com/NVIDIA/nvidia-container-toolkit/internal/discover.list.Mounts({{0xc000124280?, 0xc000138480?, 0xc00019ae78?}})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/discover/list.go:59 +0xa9
github.com/NVIDIA/nvidia-container-toolkit/internal/discover.list.Mounts({{0xc00012d800?, 0xc000138500?, 0x4?}})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/discover/list.go:59 +0xa9
github.com/NVIDIA/nvidia-container-toolkit/internal/discover.list.Mounts({{0xc000124340?, 0xc00012a3c0?, 0x80a2078500000000?}})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/discover/list.go:59 +0xa9
github.com/NVIDIA/nvidia-container-toolkit/internal/edits.FromDiscoverer({0x6fdaa0, 0xc00011e6a8})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/internal/edits/edits.go:57 +0xc2
github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate.command.generateSpec({0x69c876?}, {0x0, 0x0}, {0xc000016180, 0x13}, {0x6fd918, 0xc00007e8a0})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/cmd/nvidia-ctk/cdi/generate/generate.go:265 +0xdca
github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate.command.run({0x69bb76?}, 0xc0000e54f0?, 0xc0000b4460)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/cmd/nvidia-ctk/cdi/generate/generate.go:136 +0x13f
github.com/NVIDIA/nvidia-container-toolkit/cmd/nvidia-ctk/cdi/generate.command.build.func2(0xc0000e54a0?)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/cmd/nvidia-ctk/cdi/generate/generate.go:74 +0x27
github.com/urfave/cli/v2.(*Command).Run(0xc000014c60, 0xc000012840)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/command.go:163 +0x5bb
github.com/urfave/cli/v2.(*App).RunAsSubcommand(0xc0000cf040, 0xc000012780)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/app.go:434 +0xc8a
github.com/urfave/cli/v2.(*Command).startApp(0xc000014b40, 0xc000012780)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/command.go:278 +0x713
github.com/urfave/cli/v2.(*Command).Run(0xc000012540?, 0x3?)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/command.go:94 +0xba
github.com/urfave/cli/v2.(*App).RunContext(0xc0000ced00, {0x6fddd0?, 0xc00001a0b0}, {0xc000012080, 0x4, 0x4})
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/app.go:313 +0xb48
github.com/urfave/cli/v2.(*App).Run(...)
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/gopath/pkg/mod/github.com/urfave/cli/v2@v2.3.0/app.go:224
main.main()
    /build/nvidia-container-toolkit-K2baJ7/nvidia-container-toolkit-1.12.1/cmd/nvidia-ctk/main.go:85 +0x445
mike@pop-os:~$
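
The original error is consistent with a stale spec: the runtime tried to mount a 550.54.14 library while the installed driver is 560.35.03. A rough sketch of a check for that mismatch follows. It assumes the spec lives at /etc/cdi/nvidia.yaml (the path used in the command above) and lists library paths with the driver version in the file name, as in the log. The function names `cdi_versions` and `check_cdi_spec` are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: detect a CDI spec that still references an old NVIDIA driver version.

# Print the distinct driver versions found in libEGL_nvidia file names in a spec.
cdi_versions() {
  grep -oE 'libEGL_nvidia\.so\.[0-9]+(\.[0-9]+)+' "$1" \
    | sed -E 's/.*\.so\.//' | sort -u
}

# Compare the versions in the spec ($1) with the installed driver version ($2).
check_cdi_spec() {
  local spec="$1" driver="$2" found
  found="$(cdi_versions "$spec" | paste -sd, -)"
  if [ "$found" = "$driver" ]; then
    echo "ok: spec matches driver $driver"
  else
    echo "stale: spec references ${found:-nothing}, driver is $driver"
    return 1
  fi
}
```

One way to call it is `check_cdi_spec /etc/cdi/nvidia.yaml "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"`. A "stale" result suggests the spec needs to be regenerated with `nvidia-ctk cdi generate`.
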
netbrain commented 6 hours ago

Yeah, looks like that might be the case. I guess you should revert the Nvidia drivers in the meantime, or temporarily run zwift on another GPU if you're on a multi/dual GPU setup.

netbrain commented 6 hours ago

Or, you could try installing and maintaining zwift yourself through steam/proton or wine-bottles or lutris or ...

msmurphy commented 5 hours ago

Was able to generate the file after upgrading the container toolkit. To do that I had to add the entry below to /etc/apt/preferences.d/pop-default-settings so that the NVIDIA repository takes priority over the Pop!_OS one. That took the toolkit from version 1.12 to 1.16. My guess is that 1.12 is incompatible with the latest driver. I haven't had a chance to try the zwift container yet; I'll test after work.


Package: *
Pin: origin nvidia.github.io
Pin-Priority: 1002
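
For anyone who wants to script this, here is a small sketch that appends the stanza above to a preferences file only if it is not already there. The function name `add_nvidia_pin` is hypothetical, and the sketch assumes the NVIDIA apt repository is already configured on the machine. Note that a priority above 1000 also lets apt downgrade packages to the pinned origin's version.

```shell
#!/usr/bin/env bash
# Sketch: add the NVIDIA origin pin to an apt preferences file, idempotently.
add_nvidia_pin() {
  local prefs="$1"
  # -s keeps grep quiet if the file does not exist yet.
  if grep -qs '^Pin: origin nvidia\.github\.io$' "$prefs"; then
    echo "pin already present in $prefs"
    return 0
  fi
  # A blank line separates stanzas in apt preferences files.
  printf '\nPackage: *\nPin: origin nvidia.github.io\nPin-Priority: 1002\n' >> "$prefs"
  echo "pin added to $prefs"
}
```

On a real system this would be run as root against /etc/apt/preferences.d/pop-default-settings, and `apt-cache policy nvidia-container-toolkit` afterwards shows which repository the candidate version comes from.
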
msmurphy commented 2 hours ago

This 100% worked and I can run zwift in the container now. It also fixed the performance issues I was having. I highly recommend that anyone running Pop!_OS make this change.