Closed dalgibbard closed 1 week ago
I had the same problem, @dalgibbard's solution fixed it for me.
I did an `apt-get update`/`upgrade` to fix it at first and somehow that hosed the entire jail, but I quickly rolled back and applied the `--bind` fix.
Glad to hear you were able to get it working @dalgibbard.
I found the same error message mentioned in: https://github.com/NVIDIA/libnvidia-container/issues/224. But I'm leaving investigating the root cause and implementing a fix for this issue up to the community. I don't have an nvidia GPU in my system and I'm not looking forward to another blind, back-and-forth, trial and error fixing process akin to https://github.com/Jip-Hop/jailmaker/issues/4.
i still couldn't get immich to use graphics.
to find out the stuff to bind I ran:
nvidia-container-cli list
then jlmkr edit docker, made changes, saved, jlmkr restart docker:
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.545.23.08
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ptxjitcompiler.so.545.23.08
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.545.23.08
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-encode.so.545.23.08
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvcuvid.so.545.23.08
--bind-ro=/usr/bin/nvidia-persistenced
--bind=/dev/nvidia0
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-cfg.so.545.23.08
--bind=/dev/nvidia-uvm-tools
--bind=/dev/nvidiactl
--bind=/dev/nvidia-uvm
--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.545.23.08
--bind-ro=/usr/bin/nvidia-smi
--bind-ro=/usr/lib/nvidia/current/nvidia-smi
when trying to deploy immich i get this
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]
*noticed that when i had the following bind it wouldn't start the docker jail, so i removed it from the config and it worked again:
--bind=/dev/nvidia-modeset
You shouldn't need all these bind mounts; the only one you need is `--bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current`.
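In the jail config that means keeping a single line among the `systemd_nspawn_user_args` alongside `gpu_passthrough_nvidia=1` (a minimal sketch; the path is the TrueNAS SCALE driver location discussed in this thread and may differ on other releases):

```
systemd_nspawn_user_args=--bind-ro='/usr/lib/x86_64-linux-gnu/nvidia/current'
```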
ty will update accordingly
I created a new jail with nvidia gpu passthrough enabled and mounted the suggested directory as `ro`. I can run `nvidia-smi` and it detects my card inside the container:
nvidia-smi
Mon May 6 16:43:49 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1080 Ti Off | 00000000:2B:00.0 Off | N/A |
| 0% 37C P8 11W / 280W | 0MiB / 11264MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
When I attempt to use the card in Plex, I get the following errors:
[Req#18ae/Transcode] [FFMPEG] - Cannot load libcuda.so.1
[Req#18ae/Transcode] [FFMPEG] - Could not dynamically load CUDA
I did `bind-ro` the TrueNAS SCALE dir with the nvidia modules in it:
ls -la /usr/lib/x86_64-linux-gnu/nvidia/current/
total 79835
drwxr-xr-x 2 root root 21 Apr 22 19:20 .
drwxr-xr-x 3 root root 3 May 6 16:02 ..
lrwxrwxrwx 1 root root 12 Nov 6 21:21 libcuda.so -> libcuda.so.1
lrwxrwxrwx 1 root root 20 Nov 6 21:21 libcuda.so.1 -> libcuda.so.545.23.08
-rw-r--r-- 1 root root 29453200 Nov 6 16:49 libcuda.so.545.23.08
lrwxrwxrwx 1 root root 15 Nov 6 21:21 libnvcuvid.so -> libnvcuvid.so.1
lrwxrwxrwx 1 root root 23 Nov 6 21:21 libnvcuvid.so.1 -> libnvcuvid.so.545.23.08
-rw-r--r-- 1 root root 10009920 Nov 6 16:22 libnvcuvid.so.545.23.08
lrwxrwxrwx 1 root root 26 Nov 6 21:21 libnvidia-cfg.so.1 -> libnvidia-cfg.so.545.23.08
-rw-r--r-- 1 root root 274968 Nov 6 16:19 libnvidia-cfg.so.545.23.08
lrwxrwxrwx 1 root root 21 Nov 6 21:21 libnvidia-encode.so -> libnvidia-encode.so.1
lrwxrwxrwx 1 root root 29 Nov 6 21:21 libnvidia-encode.so.1 -> libnvidia-encode.so.545.23.08
-rw-r--r-- 1 root root 252576 Apr 25 09:50 libnvidia-encode.so.545.23.08
lrwxrwxrwx 1 root root 17 Nov 6 21:21 libnvidia-ml.so -> libnvidia-ml.so.1
lrwxrwxrwx 1 root root 25 Nov 6 21:21 libnvidia-ml.so.1 -> libnvidia-ml.so.545.23.08
-rw-r--r-- 1 root root 1992128 Nov 6 16:23 libnvidia-ml.so.545.23.08
lrwxrwxrwx 1 root root 19 Nov 6 21:21 libnvidia-nvvm.so -> libnvidia-nvvm.so.4
lrwxrwxrwx 1 root root 27 Nov 6 21:21 libnvidia-nvvm.so.4 -> libnvidia-nvvm.so.545.23.08
-rw-r--r-- 1 root root 86781944 Nov 6 17:28 libnvidia-nvvm.so.545.23.08
lrwxrwxrwx 1 root root 37 Nov 6 21:21 libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.545.23.08
-rw-r--r-- 1 root root 26589472 Nov 6 16:55 libnvidia-ptxjitcompiler.so.545.23.08
Here is the `nvidia-container-cli` output from the host:
nvidia-container-cli list
/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvidia-modeset
/dev/nvidia0
/usr/lib/nvidia/current/nvidia-smi
/usr/bin/nvidia-persistenced
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-cfg.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ptxjitcompiler.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-nvvm.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-encode.so.545.23.08
/usr/lib/x86_64-linux-gnu/nvidia/current/libnvcuvid.so.545.23.08
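To confirm those paths are actually visible from inside the jail, a quick loop helps (a sketch; the paths are a subset copied from the `nvidia-container-cli list` output above):

```shell
# Check each device node / file that nvidia-container-cli listed.
# Run inside the jail (jlmkr shell docker); prints OK or MISSING per path.
for f in /dev/nvidiactl /dev/nvidia0 /dev/nvidia-uvm \
         /usr/bin/nvidia-smi /usr/lib/x86_64-linux-gnu/nvidia/current; do
  if [ -e "$f" ]; then
    echo "OK      $f"
  else
    echo "MISSING $f"
  fi
done
```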
Jail config:
startup=1
gpu_passthrough_intel=0
gpu_passthrough_nvidia=1
# Turning off seccomp filtering improves performance at the expense of security
seccomp=1
# Use macvlan networking to provide an isolated network namespace,
# so docker can manage firewall rules
# Alternatively use --network-bridge=br1 instead of --network-macvlan
# Ensure to change eno1/br1 to the interface name you want to use
# You may want to add additional options here, e.g. bind mounts
systemd_nspawn_user_args=--network-bridge=br0
--system-call-filter='add_key keyctl bpf'
--bind='/mnt/tank/containers/:/mnt/containers'
--bind='/mnt/tank/data/apps/:/mnt/data'
--bind='/mnt/tank/media/:/mnt/media'
--bind='/mnt/tank/data/stacks:/opt/stacks'
--bind-ro='/usr/lib/x86_64-linux-gnu/nvidia/current'
# Script to run on the HOST before starting the jail
# Load kernel module and config kernel settings required for docker
pre_start_hook=#!/usr/bin/bash
set -euo pipefail
echo 'PRE_START_HOOK'
echo 1 > /proc/sys/net/ipv4/ip_forward
modprobe br_netfilter
echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
# Only used while creating the jail
distro=debian
release=bookworm
# Install docker inside the jail:
# https://docs.docker.com/engine/install/debian/#install-using-the-repository
# NOTE: this script will run in the host networking namespace and ignores
# all systemd_nspawn_user_args such as bind mounts
initial_setup=#!/usr/bin/bash
set -euo pipefail
apt-get update && apt-get -y install ca-certificates curl host
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# You generally will not need to change the options below
systemd_run_default_args=--property=KillMode=mixed
--property=Type=notify
--property=RestartForceExitStatus=133
--property=SuccessExitStatus=133
--property=Delegate=yes
--property=TasksMax=infinity
--collect
--setenv=SYSTEMD_NSPAWN_LOCK=0
systemd_nspawn_default_args=--keep-unit
--quiet
--boot
--bind-ro=/sys/module
--inaccessible=/sys/module/apparmor
Plex docker compose file:
services:
plex:
container_name: plex
image: plexinc/pms-docker
restart: unless-stopped
network_mode: host
volumes:
- /mnt/data/plex/configs:/config
- /mnt/data/plex/transcode:/transcode
- /mnt/media/plex:/data
env_file:
- .env
I can see the library is there. Any ideas?
Did you install the Nvidia runtime and configure Plex to use it?
Yes, just got it working. I missed setting the NVIDIA_* vars in my env file. After that it fires up. So, in addition to the above, I had to do the following:
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
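Equivalently, if you'd rather not use a separate env file, the same two variables can be set inline under `environment:` in the compose service (a sketch of an alternative, not what was used above):

```yaml
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
```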
Working Plex compose.yml:
services:
plex:
container_name: plex
image: plexinc/pms-docker
restart: unless-stopped
network_mode: host
volumes:
- /mnt/data/plex/configs:/config
- /mnt/data/plex/transcode:/transcode
- /mnt/media/plex:/data
deploy:
resources:
reservations:
devices:
- capabilities:
- gpu
env_file:
- .env
networks: {}
Working `nvidia-smi` output from the TrueNAS host:
Mon May 6 18:35:06 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1080 Ti Off | 00000000:2B:00.0 Off | N/A |
| 0% 45C P2 73W / 280W | 298MiB / 11264MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    2658633      C   ...lib/plexmediaserver/Plex Transcoder    296MiB  |
+---------------------------------------------------------------------------------------+
moogle here. glad it helped gratz :}
Thank you for pointing me in the right direction. 😊
np
by the way may i ask how you got ffmpeg installed? i need that for my jellyfin setup to get it to work.
jailmaker/jails/docker/rootfs/usr/lib/jellyfin-ffmpeg/ffmpeg
this location i think
I'll take a look tomorrow. I'll grab the compose file and the docker image and spin it up and see if I can get it to recognize.
np. i would really appreciate this when you are able to do so.
i think the graphics card part is setup correctly but it's how to install ffmpeg i don't know when taking into account jailmaker.
In jailmaker we have this
/mnt/tank/jailmaker/jails/docker/rootfs/usr/lib
in jellyfin the ffmpeg points to
FFmpeg path:
/usr/lib/jellyfin-ffmpeg/ffmpeg
The path to the FFmpeg application file or folder containing FFmpeg.
but that jellyfin-ffmpeg/ffmpeg does not exist there at that location.
so i thought i'd have to go jlmkr shell docker then install the repo and the custom jellyfin ffmpeg, but i don't know how.
I'm using debian bookworm for jailmaker docker.
You shouldn't really be modifying the contents of a jail from outside the jail itself, nor should you be modifying the contents of a docker container image from outside the container itself - it's just asking for trouble.
Which Docker image are you using for Jellyfin? I can see there are 3 available and I'd be surprised if they don't come with ffmpeg already installed, especially as Jellyfin has its own fork. Looking at the linuxserver.io image, I can see that it should have the jellyfin-ffmpeg5 bundle as part of it - https://github.com/linuxserver/docker-jellyfin/blob/master/package_versions.txt
i use linuxserver jellyfin
but when i enable nvidia hardware acceleration it doesn't work. i think ffmpeg doesn't get detected so it exits.
my error message https://forums.truenas.com/t/qnap-ts-877-truenas-journal/1646/590?u=mooglestiltzkin
How do you do this?
We automatically add the necessary environment variable that will utilise all the features available on a GPU on the host. Once nvidia-container-toolkit is installed on your host you will need to re/create the docker container with the nvidia container runtime --runtime=nvidia and add an environment variable -e NVIDIA_VISIBLE_DEVICES=all (can also be set to a specific gpu's UUID, this can be discovered by running nvidia-smi --query-gpu=gpu_name,gpu_uuid --format=csv ). NVIDIA automatically mounts the GPU and drivers from your host into the container.
tried adding this to environment but still didn't work
NVIDIA_VISIBLE_DEVICES=all
I think you're going down the wrong track. The error message you linked to doesn't say FFMPEG isn't present, it says FFMPEG exited with an error (Possibly due to transcoding but not necessarily). I am 99.99% certain that FFMPEG is installed and isn't the issue.
So you tried `nvidia-smi` and that works, right - good. That means that your jail can see the GPU. Now you need to check if a container within the jail can see the GPU - so run this command:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
You should get the same result; it's just running `nvidia-smi` within a container instead of on the OS itself. Try that first.
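If you want that check to be scriptable, the exit status of the `docker run` is enough (a sketch; assumes docker and the nvidia runtime are already set up in the jail):

```shell
# Run nvidia-smi in a throwaway container; report whether containers can
# reach the GPU. Container output is suppressed, only the exit status is used.
if docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi > /dev/null 2>&1; then
  echo "GPU visible from containers"
else
  echo "GPU NOT visible from containers"
fi
```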
EDIT: Looking at your docker compose script, you don't appear to have set `runtime: nvidia` either. That'll be a problem.
jellyfin:
image: lscr.io/linuxserver/jellyfin:latest
runtime: nvidia
container_name: jellyfin
environment:
- PUID=0
- PGID=0
- TZ=newyork
- JELLYFIN_PublishedServerUrl=https://jellyfin.mydomain.duckdns.org #optional
volumes:
- /mnt/docker/data/jellyfin:/config
- /mnt/xxxxx/Videos:/data/media1:ro
- /mnt/xxxxx/Videos:/data/media2:ro
ports:
- 8096:8096
- 8920:8920 #optional
- 7359:7359/udp #optional
# - 1900:1900/udp #optional
networks:
- proxy
deploy:
resources:
reservations:
devices:
- capabilities:
- gpu
restart: unless-stopped
#networks: {}
networks:
proxy:
external: true
Something like that. You also don't need all that stuff about reserving the GPU, either. You can remove all of the stuff in `deploy`.
@neoKushan
ty Neo. Now i have a better angle to look at.
I used dockge, entered bash, did the command but got this:
root@bxxxxxxxxe:/# sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
bash: sudo: command not found
did a google, went to `jlmkr shell docker` then `docker exec -it`, which is pretty much the same as the dockge earlier.
not sure how to sh into the container to be able to run that command
No no, the command is to be run on the jail, not within the container. It'll spin up a new container that only displays that output.
So 'jlmkr shell docker' then
'sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi'
but i think my jail for docker uses debian. so isn't ubuntu incorrect? confused
i deployed docker using the debian docker template https://github.com/Jip-Hop/jailmaker/tree/main/templates/docker
root@docker:~# sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
49xxxxx4a: Pull complete
Digest: sha25xxxxxxxxxxe15
Status: Downloaded newer image for ubuntu:latest
ya thought so. that doesn't look right since i'm not using ubuntu
The container image is based off Ubuntu, that's normal in the docker world. It's fine.
Containers can be based off of other operating systems like arch and will happily run on any docker host.
well after running that it says this
root@docker:~# sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
49xxxxxxa: Pull complete
Digest: sha256xxxxxxxxxxxxxxxfe15
Status: Downloaded newer image for ubuntu:latest
Tue May 7 xxxxxx 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1050 Off | 00000000:12:00.0 Off | N/A |
| 35% 36C P0 N/A / 75W | 0MiB / 2048MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
root@docker:~#
Okay good, so update your docker compose file to add `runtime: nvidia`, do another `docker compose up` to reload jellyfin, and see if that works.
how do I add `runtime: nvidia`? is that under environment? I use dockge, so `runtime=nvidia`?
I did a search i found this
version: '3'
services:
jellyfin:
image: jellyfin/jellyfin
user: 1000:1000
network_mode: 'host'
volumes:
- /path/to/config:/config
- /path/to/cache:/cache
- /path/to/media:/media
runtime: nvidia
deploy:
resources:
reservations:
devices:
- capabilities: [gpu]
I'll try https://jellyfin.org/docs/general/administration/hardware-acceleration/nvidia/
yeah I did give an example in this post but you might have missed it.
Tried it; still didn't work.
went to look at other stuff to check troubleshoot https://www.reddit.com/r/selfhosted/comments/w559xa/jellyfin_nvidia_in_docker_a_guide_for_newbies/
root@truenas[~]# jlmkr shell docker
Connected to machine docker. Press ^] three times within 1s to exit session.
root@docker:~# nvidia-smi --query-gpu=gpu_name,gpu_uuid --format=csv
name, uuid
NVIDIA GeForce GTX 1050, GPU-67xxxxx110
root@docker:~#
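If you ever need the name or UUID programmatically (e.g. to pin `NVIDIA_VISIBLE_DEVICES` to one card), the CSV output of that query is easy to slice (a sketch using the sample from this session, UUID elided as in the original paste):

```shell
# Sample output of `nvidia-smi --query-gpu=gpu_name,gpu_uuid --format=csv`.
sample='name, uuid
NVIDIA GeForce GTX 1050, GPU-67xxxxx110'

# Skip the header row, take the first CSV field: the GPU name.
echo "$sample" | tail -n +2 | cut -d, -f1
```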
hm...
You definitely need `runtime: nvidia` in your compose file for this to work, so don't remove it.
Did you add `NVIDIA_VISIBLE_DEVICES=all` to your environment variables within your compose file as well?
And when you say "it didn't work", can you be more specific and provide logs?
root@docker:~# docker exec -it jellyfin nvidia-smi
OCI runtime exec failed: exec failed: unable to start container process: exec: "nvidia-smi": executable file not found in $PATH: unknown
root@docker:~#
Why are you doing that? `nvidia-smi` won't be included in the jellyfin image, so that was never going to work.
Playback Error
This client isn't compatible with the media and the server isn't sending a compatible media format.
07/05/2024
[INF] [28] Jellyfin.Api.Helpers.MediaInfoHelper: User policy for admin. EnablePlaybackRemuxing: True EnableVideoPlaybackTranscoding: True EnableAudioPlaybackTranscoding: True
07/05/2024
[INF] [28] Jellyfin.Api.Helpers.MediaInfoHelper: StreamBuilder.BuildVideoItem( Profile=Anonymous Profile, Path=/data/media1/Anime/Overlord/Season 2/ Overlord II 01.mkv, AudioStreamIndex=null, SubtitleStreamIndex=null ) => ( PlayMethod=Transcode, TranscodeReason=VideoCodecNotSupported, AudioCodecNotSupported ) media:/videos/f6cxxxxxxxxxxxxxxxxxxxxxxcf/master.m3u8?MediaSourceId=f6cxxxxxxxxxxxxxxxxxxxxxcf&VideoCodec=h264&AudioCodec=aac,mp3&AudioStreamIndex=1&VideoBitrate=139616000&AudioBitrate=384000&MaxFramerate=23.976025&api_key=<token>&TranscodingMaxAudioChannels=6&RequireAvc=false&Tag=57f8xxxxxxxxxxxxxxxxxx4ea&SegmentContainer=ts&MinSegments=1&BreakOnNonKeyFrames=True&hevc-level=120&hevc-videobitdepth=10&hevc-profile=main10&TranscodeReasons=VideoCodecNotSupported,%20AudioCodecNotSupported
07/05/2024
[INF] [49] Jellyfin.Api.Controllers.DynamicHlsController: Current HLS implementation doesn't support non-keyframe breaks but one is requested, ignoring that request
07/05/2024
[INF] [49] Jellyfin.Api.Helpers.TranscodingJobHelper: /usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -init_hw_device cuda=cu:0 -filter_hw_device cu -hwaccel cuda -hwaccel_output_format cuda -threads 1 -autorotate 0 -i file:"/data/media1/Anime/Overlord/Season 2/ Overlord II [1080p] [x265] [10Bit] [Dual Audio] [Subbed]/ Overlord II 01.mkv" -autoscale 0 -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -preset p1 -b:v 8219375 -maxrate 8219375 -bufsize 16438750 -g:v:0 72 -keyint_min:v:0 72 -vf "setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,scale_cuda=format=yuv420p" -codec:a:0 libfdk_aac -ac 2 -ab 256000 -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/data/transcodes/cdxxxxxxxxxxxxxxxxxxxxx01ca99%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/data/transcodes/cdfxxxxxxxxxxxxxxxxxxxx99.m3u8"
07/05/2024
[ERR] [46] Jellyfin.Api.Helpers.TranscodingJobHelper: FFmpeg exited with code 1
07/05/2024
[ERR] [40] Jellyfin.Server.Middleware.ExceptionMiddleware: Error processing request. URL GET /videos/f6cexxxxxxxxxxxxxxxxxxe2cf/hls1/main/0.ts.
07/05/2024
MediaBrowser.Common.FfmpegException: FFmpeg exited with code 1
07/05/2024
at Jellyfin.Api.Helpers.TranscodingJobHelper.StartFfMpeg(StreamState state, String outputPath, String commandLineArguments, HttpRequest request, TranscodingJobType transcodingJobType, CancellationTokenSource cancellationTokenSource, String workingDirectory)
07/05/2024
at Jellyfin.Api.Controllers.DynamicHlsController.GetDynamicSegment(StreamingRequestDto streamingRequest, Int32 segmentId)
07/05/2024
at Jellyfin.Api.Controllers.DynamicHlsController.GetHlsVideoSegment(Guid itemId, String playlistId, Int32 segmentId, String container, Int64 runtimeTicks, Int64 actualSegmentLengthTicks, Nullable`1 static, String params, String tag, String deviceProfileId, String playSessionId, String segmentContainer, Nullable`1 segmentLength, Nullable`1 minSegments, String mediaSourceId, String deviceId, String audioCodec, Nullable`1 enableAutoStreamCopy, Nullable`1 allowVideoStreamCopy, Nullable`1 allowAudioStreamCopy, Nullable`1 breakOnNonKeyFrames, Nullable`1 audioSampleRate, Nullable`1 maxAudioBitDepth, Nullable`1 audioBitRate, Nullable`1 audioChannels, Nullable`1 maxAudioChannels, String profile, String level, Nullable`1 framerate, Nullable`1 maxFramerate, Nullable`1 copyTimestamps, Nullable`1 startTimeTicks, Nullable`1 width, Nullable`1 height, Nullable`1 maxWidth, Nullable`1 maxHeight, Nullable`1 videoBitRate, Nullable`1 subtitleStreamIndex, Nullable`1 subtitleMethod, Nullable`1 maxRefFrames, Nullable`1 maxVideoBitDepth, Nullable`1 requireAvc, Nullable`1 deInterlace, Nullable`1 requireNonAnamorphic, Nullable`1 transcodingMaxAudioChannels, Nullable`1 cpuCoreLimit, String liveStreamId, Nullable`1 enableMpegtsM2TsMode, String videoCodec, String subtitleCodec, String transcodeReasons, Nullable`1 audioStreamIndex, Nullable`1 videoStreamIndex, Nullable`1 context, Dictionary`2 streamOptions)
07/05/2024
at lambda_method548(Closure , Object )
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeInnerFilterAsync>g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|25_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|20_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
07/05/2024
at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Awaited|17_0(ResourceInvoker invoker, Task task, IDisposable scope)
07/05/2024
at Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
07/05/2024 at Jellyfin.Server.Middleware.ServerStartupMessageMiddleware.Invoke(HttpContext httpContext, IServerApplicationHost serverApplicationHost, ILocalizationManager localizationManager)
07/05/2024 at Jellyfin.Server.Middleware.WebSocketHandlerMiddleware.Invoke(HttpContext httpContext, IWebSocketManager webSocketManager)
07/05/2024 at Jellyfin.Server.Middleware.IpBasedAccessValidationMiddleware.Invoke(HttpContext httpContext, INetworkManager networkManager)
07/05/2024 at Jellyfin.Server.Middleware.LanFilteringMiddleware.Invoke(HttpContext httpContext, INetworkManager networkManager, IServerConfigurationManager serverConfigurationManager)
07/05/2024 at Microsoft.AspNetCore.Authorization.Policy.AuthorizationMiddlewareResultHandler.HandleAsync(RequestDelegate next, HttpContext context, AuthorizationPolicy policy, PolicyAuthorizationResult authorizeResult)
07/05/2024 at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
07/05/2024 at Jellyfin.Server.Middleware.QueryStringDecodingMiddleware.Invoke(HttpContext httpContext)
07/05/2024 at Swashbuckle.AspNetCore.ReDoc.ReDocMiddleware.Invoke(HttpContext httpContext)
07/05/2024 at Swashbuckle.AspNetCore.SwaggerUI.SwaggerUIMiddleware.Invoke(HttpContext httpContext)
07/05/2024 at Swashbuckle.AspNetCore.Swagger.SwaggerMiddleware.Invoke(HttpContext httpContext, ISwaggerProvider swaggerProvider)
07/05/2024 at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
07/05/2024 at Jellyfin.Server.Middleware.RobotsRedirectionMiddleware.Invoke(HttpContext httpContext)
07/05/2024 at Jellyfin.Server.Middleware.LegacyEmbyRouteRewriteMiddleware.Invoke(HttpContext httpContext)
07/05/2024 at Microsoft.AspNetCore.ResponseCompression.ResponseCompressionMiddleware.InvokeCore(HttpContext context)
07/05/2024 at Jellyfin.Server.Middleware.ResponseTimeMiddleware.Invoke(HttpContext context, IServerConfigurationManager serverConfigurationManager)
07/05/2024 at Jellyfin.Server.Middleware.ExceptionMiddleware.Invoke(HttpContext context)
07/05/2024 [INF] [46] Jellyfin.Api.Helpers.MediaInfoHelper: User policy for admin. EnablePlaybackRemuxing: True EnableVideoPlaybackTranscoding: True EnableAudioPlaybackTranscoding: True
07/05/2024 [INF] [46] Jellyfin.Api.Helpers.MediaInfoHelper: StreamBuilder.BuildVideoItem( Profile=Anonymous Profile, Path=/data/media1/Anime/Overlord/Season 2/ Overlord II [1080p] [x265] [10Bit] [Dual Audio] [Subbed]/ Overlord II 01.mkv, AudioStreamIndex=1, SubtitleStreamIndex=3 ) => ( PlayMethod=Transcode, TranscodeReason=VideoCodecNotSupported, AudioCodecNotSupported ) media:/videos/f6cxxxxxxxxxxxxxxe2cf/master.m3u8?MediaSourceId=f6cexxxxxxxxxxxxxxxxe2cf&VideoCodec=h264&AudioCodec=aac,mp3&AudioStreamIndex=1&VideoBitrate=139616000&AudioBitrate=384000&MaxFramerate=23.976025&api_key=<token>&TranscodingMaxAudioChannels=6&RequireAvc=false&Tag=57fxxxxxxxxx24ea&SegmentContainer=ts&MinSegments=1&BreakOnNonKeyFrames=True&hevc-level=120&hevc-videobitdepth=10&hevc-profile=main10&TranscodeReasons=VideoCodecNotSupported,%20AudioCodecNotSupported
07/05/2024 [INF] [35] Jellyfin.Api.Helpers.TranscodingJobHelper: Deleting partial stream file(s) /config/data/transcodes/cdfxxxxxxxxxxxxxxxxxxa99.m3u8
07/05/2024 [20:45:05] [INF] [35] Jellyfin.Api.Controllers.DynamicHlsController: Current HLS implementation doesn't support non-keyframe breaks but one is requested, ignoring that request
at Jellyfin.Server.Middleware.ResponseTimeMiddleware.Invoke(HttpContext context, IServerConfigurationManager serverConfigurationManager)
07/05/2024 at Jellyfin.Server.Middleware.ExceptionMiddleware.Invoke(HttpContext context)
07/05/2024 [INF] [40] Jellyfin.Api.Helpers.TranscodingJobHelper: Deleting partial stream file(s) /config/data/transcodes/1011xxxxxxxxxxxxxxxxxx9d.m3u8
07/05/2024 [INF] [39] Emby.Server.Implementations.Session.SessionManager: Playback stopped reported by app Jellyfin Web 10.8.13 playing Overlord II 01. Stopped at 0 ms
07/05/2024 [20:45:07] [WRN] [39] Jellyfin.Server.Middleware.ResponseTimeMiddleware: Slow HTTP Response from https://jellyfin.xxxxxxxxxxxxx.duckdns.org/Sessions/Playing/Stopped to 192.168.0.152 in 0:00:01.5755675 with Status Code 204
jellyfin transcoding setting https://i.imgur.com/BR1uCFD.jpeg
Why are you doing that? nvidia-smi won't be included in the jellyfin image, so that was never going to work.
No idea, just trying whatever the guy mentioned x-x; I'm not much of a Linux user xd
services:
  jellyfin:
    image: lscr.io/linuxserver/jellyfin:latest
    container_name: jellyfin
    environment:
      - PUID=0
      - PGID=0
      - TZ=America/New_York # must be a tz database name; "newyork" is not valid
      - JELLYFIN_PublishedServerUrl=https://jellyfin.xxxxxxxxxxxx.duckdns.org # optional
    volumes:
      - /mnt/docker/data/jellyfin:/config
      - /mnt/xxxxxxxx/Videos:/data/media1:ro
      - /mnt/xxxxxxxxxxx/Videos:/data/media2:ro
    runtime: nvidia
    ports:
      - 8096:8096
      - 8920:8920 # optional
      - 7359:7359/udp # optional
      # - 1900:1900/udp # optional
    networks:
      - proxy
    restart: unless-stopped
networks:
  proxy:
    external: true
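For reference, a minimal sketch of just the GPU-related additions to a service like the one above, assuming the NVIDIA Container Toolkit is installed on the host. The commented-out alternative uses the Compose device-reservation syntax instead of the legacy `runtime` key:

```yaml
services:
  jellyfin:
    runtime: nvidia                  # selects the runtime registered in /etc/docker/daemon.json
    environment:
      - NVIDIA_VISIBLE_DEVICES=all   # expose all GPUs to this container
    # Alternative, using Compose device reservations instead of `runtime`:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]
```

Either form should work; the device-reservation syntax is the one documented by NVIDIA for newer Docker/Compose versions.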
In the Dockge .env section I have NVIDIA_VISIBLE_DEVICES=all. I had stopped and redeployed, so it should be active with these settings.
Tested on an HEVC/x265 MKV video file; most of my videos are of this type. Jellyfin works, and I even got it to work with Jellyseerr, but I couldn't get NVIDIA hardware transcoding to work ;[ Machine learning with the NVIDIA GPU works for Immich, though.
Okay, those are the logs from jellyfin - there should be a separate set of logs for ffmpeg or transcoding - can you find and post those?
Not sure where those are, but I'll try to look.
Check for log files in /mnt/docker/data/jellyfin
omg, it worked.
I was testing some stuff: in the docker compose, under environment, I added NVIDIA_VISIBLE_DEVICES=all, and after making that change it worked.
So the .env in Dockge where I had NVIDIA_VISIBLE_DEVICES=all apparently did absolutely nothing, because I can't otherwise explain why the same value didn't kick in. Albeit I had the variable both in the compose environment and in the Dockge UI; not sure of the implications of that. I just assumed that if you had the env in both places they would both be active. Guess not.
Anyway, that's what fixed it.
I think the solution overall was a combination of steps you suggested:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    runtime: nvidia
Hope this helps the next person who comes along and runs into my issue.
ty so much @neoKushan and @dasunsrule32
Glad you got it working! In your compose file, did you add something like this to actually apply the .env file? 😅
    env_file:
      - ./.env
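For anyone else tripped up by this: with Compose, `env_file` loads variables into the container's environment, while a bare `.env` next to the compose file is only used for variable substitution inside the YAML itself. A sketch, assuming the .env file (containing NVIDIA_VISIBLE_DEVICES=all) sits beside compose.yml:

```yaml
services:
  jellyfin:
    env_file:
      - ./.env              # injects NVIDIA_VISIBLE_DEVICES=all into the container
    environment:
      - TZ=America/New_York # entries here take precedence over env_file values
```

So a variable defined only in a Dockge-managed .env has no effect on the container unless the compose file references it one of these ways.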
ty
No, I did not; I didn't realize I had to. I'm new to Dockge, it was recently recommended to me.
I'm more familiar with Portainer, where we had to use stacks.env in our compose. Thanks for the reminder, I will look into that.
Portainer sits on top of docker, you can still use it if you want. Just spin up portainer and log into that.
  portainer:
    image: portainer/portainer-ce
    command: -H unix:///var/run/docker.sock
    container_name: portainer
    restart: always
    ports:
      - "9000:9000"
      - "8000:8000"
    environment:
      - PGID=999
      - PUID=3001
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /mnt/docker/data/portainer:/data
Add that to your compose and it should be accessible via http://192.168.x.x:9000 (replace with your server's IP)
omg, it worked. In the docker compose, under environment, I added NVIDIA_VISIBLE_DEVICES=all, and after making that change it worked. ty so much @neoKushan and @dasunsrule32
Nice! I don't know that you need the runtime because it's defined in the docker daemon config, but it can't hurt to have it.
cat /etc/docker/daemon.json
{
  "runtimes": {
    "nvidia": {
      "args": [],
      "path": "nvidia-container-runtime"
    }
  }
}
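Whether `runtime: nvidia` in the compose file matters depends on this registration existing at all. A quick sketch of a check; it uses an inlined sample with the same shape as the JSON above so it runs anywhere, but on a real host you would read /etc/docker/daemon.json instead:

```shell
# Sample daemon.json content (assumption: same shape as the file above).
# On a real host: daemon_json=$(cat /etc/docker/daemon.json)
daemon_json='{"runtimes":{"nvidia":{"args":[],"path":"nvidia-container-runtime"}}}'

# Look for the nvidia runtime entry.
if printf '%s' "$daemon_json" | grep -q '"nvidia"'; then
  runtime_ok=yes
else
  runtime_ok=no
fi
echo "nvidia runtime registered: $runtime_ok"
```

If the entry is missing, `runtime: nvidia` in compose will fail regardless of the environment variables.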
Just to be safe, I did add it to my compose.yml, but it didn't affect it either way. The big one is the NVIDIA_* vars to enable the GPU in the container.
Mine is the same as yours in that location.
Thanks for the update.
Would be great if anyone could focus on getting nvidia passthrough working again without having to do workarounds and provide a PR because this issue is getting too long IMHO. 😅
Lol yeah, can we take the unrelated issues offline? I do plan to make a PR for this but keep forgetting. I'll stick it in my calendar as a reminder and see if I can come up with something tomorrow.
I have a custom config for nvidia that I'm testing now. I'll push it soon with example instructions how to deploy.
I can confirm that the config is working out of the box with Nvidia. I just created a jail and booted Plex on it and hw transcoding is working. I'll submit it shortly.
This is the PR, I'm going to work on adding documentation now as well.
@dasunsrule32 reports not running into the "file too short" error on SCALE 24.04 when using the pre-release version of jailmaker from the develop branch:
https://github.com/Jip-Hop/jailmaker/pull/163#issuecomment-2101517482
I don't think I added anything in particular to fix the "file too short" error though...
Latest version of jailmaker (1.1.5). As per title: in Dragonfish 24.04.0, Nvidia passthrough seems to be broken. nvidia-smi works fine on the host, but inside the container it fails with the "file too short" error. The script uses nvidia-container-cli list to find the nvidia files which need mounting, but the container expects files outside of that list; note that the list doesn't include the file the container is expecting.
Adding a manual mount to my jail's config for --bind-ro=/usr/lib/x86_64-linux-gnu/nvidia/current resolved it though; not sure if that's a good idea or not, but it works at least :)
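The bind flags used throughout this thread can be generated mechanically from the CLI output. A hypothetical sketch: it uses a hard-coded sample list (a stand-in for real `nvidia-container-cli list` output, with paths borrowed from this thread) and emits systemd-nspawn bind flags, read-write for /dev nodes and read-only for everything else:

```shell
# Stand-in for `nvidia-container-cli list` output; on a real host,
# replace this with: sample_list=$(nvidia-container-cli list)
sample_list="/dev/nvidiactl
/dev/nvidia0
/dev/nvidia-uvm
/usr/bin/nvidia-smi
/usr/lib/x86_64-linux-gnu/nvidia/current/libcuda.so.545.23.08"

flags=""
while IFS= read -r path; do
  case "$path" in
    /dev/*) flags="$flags --bind=$path" ;;      # device nodes need write access
    *)      flags="$flags --bind-ro=$path" ;;   # binaries/libraries read-only
  esac
done <<EOF
$sample_list
EOF

echo "$flags"
```

Note that this only covers what the CLI reports; as described above, extra paths such as /usr/lib/x86_64-linux-gnu/nvidia/current may still need to be added by hand.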