Closed JurgenCruz closed 5 months ago
Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.
Does opensuse have SELinux? No issues using Nvidia with jellyfin here.
From what I can read it does have SELinux, but is there a command I can run to verify?
I believe `sestatus` will show it.
`sestatus` is not found. I do have AppArmor though. Should I look into that?
Now that I think of it, I ran `docker exec -it jellyfin ldconfig` after reading the official docker instructions while trying to troubleshoot this problem. Could that have had an impact?
I just spawned an official jellyfin docker image and there it is working. However, the commands are different.
Official Image:
/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -fflags +genpts -i file:"/MediaCenter/media/movies/movie.mkv" -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 copy -bsf:v h264_mp4toannexb -start_at_zero -codec:a:0 libfdk_aac -ac 2 -ab 384000 -af "volume=2" -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 6 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/transcodes/d4f88456e2df8a53fbc108a17da107f2%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/transcodes/d4f88456e2df8a53fbc108a17da107f2.m3u8"
linuxserver image:
/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -init_hw_device cuda=cu:0 -filter_hw_device cu -hwaccel cuda -hwaccel_output_format cuda -threads 1 -autorotate 0 -canvas_size 1920x800 -i file:"/MediaCenter/media/movies/movie.mkv" -autoscale 0 -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:0 -codec:v:0 h264_nvenc -preset p1 -b:v 11162772 -maxrate 11162772 -bufsize 22325544 -profile:v:0 high -g:v:0 72 -keyint_min:v:0 72 -filter_complex "[0:5]scale=s=1920x800:flags=fast_bilinear,format=yuva420p,hwupload=derive_device=cuda[sub];[0:0]setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,scale_cuda=format=yuv420p[main];[main][sub]overlay_cuda=eof_action=endall:shortest=1:repeatlast=0" -start_at_zero -codec:a:0 libfdk_aac -ac 2 -ab 384000 -af "volume=2" -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/data/transcodes/6387e39cbb8b1411eb28a3a29fa9c0af%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/data/transcodes/6387e39cbb8b1411eb28a3a29fa9c0af.m3u8"
Not sure why there is a difference, but the second one is trying to use CUDA.
Nvidia uses cuda for transcoding. The top command to me looks like it's not using your GPU.
It seems like something either opensuse is doing, or something you've done when creating the container, is messing with this. Realistically you shouldn't need to adjust permissions when adding Nvidia in, as that handles it all for you.
> Nvidia uses cuda for transcoding. The top command to me looks like it's not using your GPU.

I agree. I can confirm it is not using the GPU because `nvidia-smi` shows no usage. I won't be fiddling with the official image. The linuxserver image does seem to be trying to use CUDA but is not able to. The `root` user inside the container can execute `nvidia-smi` just fine; it is only the `abc` user that doesn't seem to have permissions.
My understanding is that PUID and PGID are just used for accessing the volumes with the right permissions, right? In any case, I added both users to the `video` group, if that is relevant. Not sure what to do. I'll try recreating the container from scratch and report back.
Ok, something even weirder. I created another container with the linuxserver image and it uses yet another ffmpeg command:
/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -fflags +genpts -f matroska,webm -i file:"/MediaCenter/media/movies/movie.mkv" -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 copy -bsf:v h264_mp4toannexb -start_at_zero -codec:a:0 libfdk_aac -ac 2 -ab 384000 -af "volume=2" -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 6 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/data/transcodes/4b86aa41bde3e2e2d2cf29b80bd59710%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/data/transcodes/4b86aa41bde3e2e2d2cf29b80bd59710.m3u8"
No CUDA being used. I tried running `nvidia-smi` as `root` and it worked, then as `abc` and got the same permissions error. So now I don't know what I did in the first container to enable CUDA. Maybe the `ldconfig`?
Are you setting your transcoding settings within jellyfin? it won't use it by default.
Yes, I enabled Nvidia NVENC hardware acceleration and enabled all codecs. Then I enabled the NVDEC decoder and hardware encoding. Finally, I enabled tone mapping.
I also tried the `ldconfig` in the second container and it didn't change anything, so I don't think that is it.
[2024-03-10 21:20:47.925 +00:00] [INF] [35] Jellyfin.Api.Helpers.TranscodingJobHelper: "/usr/lib/jellyfin-ffmpeg/ffmpeg" "-analyzeduration 200M -init_hw_device cuda=cu:0 -filter_hw_device cu -hwaccel cuda -hwaccel_output_format cuda -threads 1 -autorotate 0 -i file:\"/data/media/tv/tvshow/Season 01/S01E01 - Episode 1.mp4\" -autoscale 0 -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -preset p1 -b:v 1274180 -maxrate 1274180 -bufsize 2548360 -profile:v:0 high -g:v:0 75 -keyint_min:v:0 75 -vf \"setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,scale_cuda=w=720:h=404:format=yuv420p\" -codec:a:0 copy -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type mpegts -start_number 0 -hls_segment_filename \"/transcode/e06717188347f2383877c7b391c753fc%d.ts\" -hls_playlist_type vod -hls_list_size 0 -y \"/transcode/e06717188347f2383877c7b391c753fc.m3u8\""
This is working fine in my deployment of jellyfin. Maybe you're trying to run a codec that your gpu doesn't support? Unfortunately I can't replicate your issue.
Can you share your compose file? or how are you deploying your container?
Unfortunately it's on my test unraid system which has my nvidia card in. I can give you the docker run though (just ignore the unraid specific bits):
```
docker run \
  -d \
  --name='jellyfin' \
  --net='proxy' \
  --cpuset-cpus='1,2,3' \
  -e TZ="Europe/London" \
  -e HOST_OS="Unraid" \
  -e HOST_HOSTNAME="Server" \
  -e HOST_CONTAINERNAME="jellyfin" \
  -e 'NVIDIA_VISIBLE_DEVICES'='all' \
  -e 'PUID'='99' \
  -e 'PGID'='100' \
  -e 'UMASK'='022' \
  -l net.unraid.docker.managed=dockerman \
  -l net.unraid.docker.webui='http://[IP]:[PORT:8096]' \
  -l net.unraid.docker.icon='https://raw.githubusercontent.com/linuxserver/docker-templates/master/linuxserver.io/img/jellyfin-logo.png' \
  -p '8097:8096/tcp' \
  -v '/mnt/user/data/media/':'/data/media':'rw' \
  -v '/mnt/disks/ssd/.appdata/letsencrypt/keys/letsencrypt/':'/certs':'ro,slave' \
  -v '/tmp/':'/transcode':'rw' \
  -v '/mnt/disks/ssd/.appdata/jellyfin':'/config':'rw,slave' \
  --device=/dev/dri \
  --runtime=nvidia 'lscr.io/linuxserver/jellyfin'
```
Discovered why one was not using CUDA and the other was: one was trying to load subtitles and the other one wasn't. I enabled subtitles in both and now both fail.
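The difference between the logged commands comes down to the video codec argument: `-codec:v:0 copy` stream-copies the video without re-encoding, so the GPU is never touched, while `-codec:v:0 h264_nvenc` requests an NVENC encode. A minimal sketch for classifying a logged command this way follows. The function name `ffmpeg_video_mode` is hypothetical, and the sample strings are shortened from the commands quoted in this thread.

```shell
# Classify how a logged ffmpeg command line treats the first video stream.
# Prints "copy" (stream copy, no GPU work), "nvenc" (NVIDIA hardware encode),
# or "other" for anything else.
ffmpeg_video_mode() {
    case "$1" in
        *"-codec:v:0 copy"*)       echo "copy" ;;
        *"-codec:v:0 h264_nvenc"*) echo "nvenc" ;;
        *)                         echo "other" ;;
    esac
}

# Shortened samples of the two kinds of command seen in the logs.
ffmpeg_video_mode '-map 0:0 -map 0:1 -map -0:s -codec:v:0 copy -bsf:v h264_mp4toannexb'   # prints: copy
ffmpeg_video_mode '-hwaccel cuda -map 0:0 -codec:v:0 h264_nvenc -preset p1'               # prints: nvenc
```

A command classified as `copy` says nothing about whether the GPU is reachable, which is why the containers only started failing once subtitles forced a real transcode.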
I ran `ll /dev/nvidia*` and found the following:
root@8d5ddf97fff3:/# ll /dev/nvidia*
crw-rw---- 1 root 484 195, 0 Mar 10 18:50 /dev/nvidia0
crw-rw---- 1 root 484 195, 255 Mar 10 18:50 /dev/nvidiactl
crw-rw---- 1 root 484 195, 254 Mar 10 18:50 /dev/nvidia-modeset
crw-rw-rw- 1 root root 235, 0 Mar 10 18:50 /dev/nvidia-uvm
crw-rw-rw- 1 root root 235, 1 Mar 10 18:50 /dev/nvidia-uvm-tools
/dev/nvidia-caps:
total 0
drwxr-xr-x 2 root root 80 Mar 10 19:26 ./
drwxr-xr-x 6 root root 460 Mar 10 19:26 ../
cr-------- 1 root root 238, 1 Mar 10 19:26 nvidia-cap1
cr--r--r-- 1 root root 238, 2 Mar 10 19:26 nvidia-cap2
gid 484 is the `video` group on the host machine. If I do a `cat /etc/group` I get:
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:
floppy:x:25:
tape:x:26:
sudo:x:27:
audio:x:29:
dip:x:30:
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:jellyfin
sasl:x:45:
plugdev:x:46:
staff:x:50:
games:x:60:
users:x:100:abc
nogroup:x:65534:
crontab:x:101:
abc:x:472:
jellyfin:x:102:
From this I can see that there is a `video` group inside the container, but the ids do not match (44 inside the container versus 484 on the host). Not sure if this is the problem, but it looks like it: only the container's root will be able to access the GPUs.
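The mismatch can be checked mechanically by comparing the gid that owns the device node with the gids the user belongs to. This is a minimal sketch: the function name `gid_in_list` is hypothetical, the gid 484 comes from the device listing above, and the list `472 100` assumes that `abc` has the `abc` group (472) as its primary group plus `users` (100), as the `/etc/group` output suggests.

```shell
# Return 0 if gid $1 appears in the space-separated gid list $2, else 1.
gid_in_list() {
    for g in $2; do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

# Values from the listings above: /dev/nvidia0 is owned by gid 484 with mode
# crw-rw----, and abc is assumed to belong to gids 472 and 100.
if gid_in_list 484 "472 100"; then
    echo "abc can open /dev/nvidia0 via its group"
else
    echo "abc is not in gid 484"
fi
# prints: abc is not in gid 484
```

Inside a running container, the real values could be fed in with `gid_in_list "$(stat -c '%g' /dev/nvidia0)" "$(id -G abc)"`, assuming GNU `stat` is available.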
root@jf:/# ll /dev/nvidia*
crw-rw-rw- 1 root root 195, 255 Mar 10 21:16 /dev/nvidiactl
crw-rw-rw- 1 root root 241, 0 Mar 10 21:16 /dev/nvidia-uvm
crw-rw-rw- 1 root root 241, 1 Mar 10 21:16 /dev/nvidia-uvm-tools
I don't believe this to be a container issue. If it was, we would have more reports of this (and I wouldn't be able to run it).
I believe there is something extra going on with either opensuse or how you've configured your users/groups with docker. Unfortunately I'm not in a position to spin up a test opensuse instance to look into this further.
Personally I would look further into if opensuse does have SELinux enabled by default and if so, try disabling it.
If I add the abc user to the 484 group, it is able to run `nvidia-smi`. But yeah, that is not a solution that will persist. I also think it might be opensuse; I will look into it and SELinux. Thank you for your help so far :)
The only selinux package I have installed is `libselinux1`, and from the GRUB configuration I can see that the "security=apparmor" parameter is set, not selinux. Back to square one =(
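The same check can be made against the parameters the kernel was actually booted with, rather than the GRUB configuration. This is a minimal sketch: the function name `lsm_from_cmdline` and the sample command line are hypothetical, and only the `security=` parameter mentioned above is inspected.

```shell
# Print the value of the security= kernel parameter found in the command
# line string $1. Prints nothing and returns non-zero if it is absent.
lsm_from_cmdline() {
    value=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^security=//p' | head -n 1)
    [ -n "$value" ] && printf '%s\n' "$value"
}

# Hypothetical sample command line for illustration.
lsm_from_cmdline 'BOOT_IMAGE=/boot/vmlinuz root=/dev/sda2 quiet security=apparmor'
# prints: apparmor
```

On a live host, `lsm_from_cmdline "$(cat /proc/cmdline)"` would read the running kernel's parameters.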
So, I tried the official image again after realizing that it had only worked because it was not loading the subtitles. This time it also failed. But as soon as I removed the `user: 472:472` from the compose file, it worked just fine. It seems the official image runs jellyfin as the root user inside the container unless you set `user`. Setting the user breaks it.
I tried removing the PUID and PGID from the linuxserver container, but now the problem is that it cannot read the movie files any more because of permissions. Plus, I really don't think it would solve anything, since it is still using the abc user, just no longer mapped to the PUID and PGID.
This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.
This issue is locked due to inactivity
Is there an existing issue for this?
Current Behavior
Executing docker-compose as a non-root user, with PUID and PGID set to another non-root user in the service definition, leads to the `abc` user inside the container not being able to access the GPU. Transcoding is not working because of this and ffmpeg returns 1 error. Both the running user and the PUID user belong to the `video` group on the host machine.
I verified this by running `docker exec -it jellyfin bash` to get a shell inside the container. It logs me in as the container's root even though I didn't use sudo. If I run `nvidia-smi`, it correctly shows me the GPU, which means the container has access to the GPU just fine. I even ran the failing ffmpeg command that I grabbed from the logs, and it worked. I then used `runuser -u abc bash` to run a shell as abc. When I execute `nvidia-smi` there, I get the error `Failed to initialize NVML: Insufficient Permissions`.
Expected Behavior
I expect the `abc` user to have enough permissions to access the GPU when not running docker-compose as root and when PUID and PGID are set to a non-root user.
Steps To Reproduce
1. Create users `media` and `admin`.
2. Create `mnt/Tank/Apps/jellyfin` and chown it to the `media` user.
3. Create `mnt/Tank/MediaCenter` and chown it to the `media` user.
4. [...] `mnt/Tank/MediaCenter`
5. Add the `video` group to both users with `usermod -aG video {user}`.
6. As the `admin` user, run the `docker-compose up -d` command from the same folder as the docker-compose.yml (pasted below).
7. Run `docker exec -it jellyfin bash` and then `runuser -u abc bash` followed by `nvidia-smi`.
Environment
CPU architecture
x86-64
Docker creation
Container logs