tteck / Proxmox

Proxmox VE Helper-Scripts
https://Helper-Scripts.com
MIT License

GID mismatch between LXC container and host causing inability to enable hardware acceleration. #2523

Closed d1scolor closed 7 months ago

d1scolor commented 7 months ago

Please verify that you have read and understood the guidelines.

yes

A clear and concise description of the issue.

The group owner of the /dev/dri/renderD128 device is set to ssl-cert in the Jellyfin LXC container created by the jellyfin-install.sh script. This is likely because on my host PVE system (v8.1) the GID of render is 104, while in the LXC (using the Ubuntu 22.04 template) 104 is the GID of ssl-cert.

Which Linux distribution are you employing?

Ubuntu 22.04

If relevant, including screenshots or a code block can be helpful in clarifying the issue.

No response

Please provide detailed steps to reproduce the issue.

Install Jellyfin using jellyfin-install.sh on PVE 8.1 from the shell. Configure the Jellyfin instance and enable hardware acceleration in the Jellyfin admin panel. Play a video at a lower bitrate to trigger hardware acceleration. The player shows the error message "Playback Error This client isn't compatible with the media and the server isn't sending a compatible media format." Go to the shell of the LXC container:

ls -l /dev/dri
total 0
drwxr-xr-x 2 root root         80 Feb 21 10:11 by-path
crw-rw---- 1 root video    226,   0 Feb 21 10:11 card0
crw-rw---- 1 root **ssl-cert** 226, 128 Feb 21 10:11 renderD128

cat /etc/group | grep "ssl-cert"
ssl-cert:x:104:

Go to the shell of the PVE host:

ls -l /dev/dri
total 0
drwxr-xr-x 2 root root         80 Feb 21 10:11 by-path
crw-rw---- 1 root video  226,   0 Feb 21 10:11 card0
crw-rw---- 1 root render 226, 128 Feb 21 10:11 renderD128

cat /etc/group | grep "render"
render:x:104:
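
One way to confirm this kind of mismatch is to read the numeric GID on the device node and look up which name that number maps to in the local /etc/group, once on the host and once in the container. The helper below is a minimal sketch: the function name group_for_gid is made up for illustration, and it assumes a standard colon-separated group file.

```shell
# Hypothetical helper: print the group name(s) that hold a numeric GID
# in a group file (defaults to /etc/group).
group_for_gid() {
  local gid=$1 file=${2:-/etc/group}
  awk -F: -v gid="$gid" '$3 == gid { print $1 }' "$file"
}

# Usage (run on the host and inside the container, then compare):
#   stat -c '%g' /dev/dri/renderD128                      # numeric GID on the device node
#   group_for_gid "$(stat -c '%g' /dev/dri/renderD128)"   # local name for that GID
```

If the two sides print different names for the same number, the device keeps its numeric GID but is shown under whatever name the container assigns to it.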

d1scolor commented 7 months ago

Oh wait, I just found that this has been discussed in https://github.com/tteck/Proxmox/discussions/1056. Is it possible to update the script to fix this? Or is this due to PVE's behaviour, so that we have to use something like https://github.com/tteck/Proxmox/discussions/1056#discussioncomment-8070169 to fix it?

tteck commented 7 months ago

Execute the command below in the Jellyfin LXC console, then reboot.

sed -i '/^render:x:108:root,jellyfin$/d; s/^ssl-cert:x:104:$/render:x:104:root,jellyfin/' /etc/group

tteck commented 7 months ago

Can you confirm this resolves the issue?

d1scolor commented 7 months ago

I'm using cron to run chgrp render /dev/dri/renderD128 at startup, which solved the issue. But I think your proposal should work too, although I'm not sure whether the ssl-cert group is used anywhere.
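
The cron workaround described here could look roughly like the sketch below. The @reboot entry and the wrapper name ensure_device_group are assumptions for illustration, not the exact setup that was used; the wrapper only adds a check so that a missing device node does not produce an error.

```shell
# Hypothetical wrapper: set the group of a device node if it exists.
ensure_device_group() {
  local dev=$1 group=$2
  [ -e "$dev" ] || return 0   # device not present: nothing to do
  chgrp "$group" "$dev"
}

# Example root crontab entry inside the container (assumed form):
#   @reboot chgrp render /dev/dri/renderD128
```

This changes the owner group of the device node on every boot rather than the group database, so it has to run again after each restart.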

tteck commented 7 months ago

Could someone please test these commands to see whether they resolve the issue?

Jellyfin Ubuntu 22.04

sed -i -e 's/^ssl-cert:x:104:$/render:x:104:root,jellyfin/' -e 's/^render:x:108:root,jellyfin$/ssl-cert:x:108:/' /etc/group

Jellyfin Debian 12

sed -i -e 's/^sgx:x:104:$/render:x:104:root,jellyfin/' -e 's/^render:x:106:root,jellyfin$/sgx:x:106:/' /etc/group

Plex Ubuntu 22.04

sed -i -e 's/^ssl-cert:x:104:plex$/render:x:104:root,plex/' -e 's/^render:x:108:root$/ssl-cert:x:108:plex/' /etc/group

Plex Debian 12

sed -i -e 's/^sgx:x:104:plex$/render:x:104:root,plex/' -e 's/^render:x:106:root$/sgx:x:106:plex/' /etc/group

Emby Ubuntu 22.04

sed -i -e 's/^ssl-cert:x:104:$/render:x:104:root,emby/' -e 's/^render:x:108:root,emby$/ssl-cert:x:108:/' /etc/group

Tdarr - Scripted - Channels - Unmanic Debian 12

sed -i -e 's/^sgx:x:104:$/render:x:104:root/' -e 's/^render:x:106:root$/sgx:x:106:/' /etc/group

d1scolor commented 7 months ago

Can confirm this worked on my other test setup, also on PVE 8.1:

Jellyfin Ubuntu 22.04

sed -i '/^render:x:108:root,jellyfin$/d; s/^ssl-cert:x:104:$/render:x:104:root,jellyfin/' /etc/group

joaomajesus commented 7 months ago

Hi, first of all thanks for this solution! It fixed my Jellyfin installation! Now I need the same for Tdarr. The thing is that on my Tdarr LXC the renderD128 device is assigned to the group sgx:

root@tdarr:~# dir -l -a /dev/dri
total 0
drwxr-xr-x 3 root video      100 Jan 25 21:23 .
drwxr-xr-x 9 root root       640 Feb 21 06:28 ..
drw-rw---- 2 root root        80 Jan 25 21:23 by-path
crw-rw---- 1 root video 226,   0 Jan 25 21:23 card0
crw-rw---- 1 root sgx   226, 128 Jan 25 21:23 renderD128

I tried the above sed command for Debian, replacing the plex user with tdarr, but the group still shows sgx:

sed -i -e 's/^sgx:x:104:tdarr$/render:x:104:root,tdarr/' -e 's/^render:x:106:root$/sgx:x:106:tdarr/' /etc/group

Thanks.

tteck commented 7 months ago

Tdarr Debian 12

sed -i -e 's/^sgx:x:104:$/render:x:104:root/' -e 's/^render:x:106:root$/sgx:x:106:/' /etc/group

joaomajesus commented 7 months ago

Tdarr Debian 12

sed -i -e 's/^sgx:x:104:$/render:x:104:root/' -e 's/^render:x:106:root$/sgx:x:106:/' /etc/group

thank you so much!! 💯

AudriusTGo commented 7 months ago

No difference for me on the Plex LXC. Still no hardware acceleration after running this.

tteck commented 7 months ago

Are you a Plex Pass member?

AudriusTGo commented 7 months ago

Are you a Plex Pass member?

Yes. The only thing that helps is running this after every restart:

chown root:render /dev/dri/renderD128 -R
chmod 0777 /dev/dri/*
chown root:render /dev/dri/by-path -R

tteck commented 7 months ago

The sed command should make that unnecessary.

AudriusTGo commented 7 months ago

Maybe the sed command isn't working because my LXC is on 20.04?

root@plex:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

tteck commented 7 months ago

Yep, that would be the reason.

If you want, show the output of cat /etc/group and I can create a sed command for 20.04

AudriusTGo commented 7 months ago

Is it possible to update Ubuntu in the LXC? Here is the output of cat /etc/group:

root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:syslog
tty:x:5:syslog
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:
floppy:x:25:
tape:x:26:
sudo:x:27:
audio:x:29:
dip:x:30:
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:plex
sasl:x:45:
plugdev:x:46:
staff:x:50:
games:x:60:
users:x:100:
nogroup:x:65534:
crontab:x:101:
messagebus:x:102:
syslog:x:103:plex
ssl-cert:x:104:
input:x:105:
kvm:x:106:
render:x:107:
postfix:x:108:
postdrop:x:109:
ssh:x:110:
systemd-journal:x:111:
systemd-network:x:112:
systemd-resolve:x:113:
systemd-timesync:x:114:
uuidd:x:115:
tcpdump:x:116:
systemd-coredump:x:999:
plex:x:998:

tteck commented 7 months ago

Try

sed -i -e 's/^ssl-cert:x:104:$/render:x:104:root,plex/' -e 's/^render:x:107:$/ssl-cert:x:107:plex/' /etc/group

AudriusTGo commented 7 months ago

Now it looks OK. Thanks for the help! But should I update Ubuntu in the LXC?

tteck commented 7 months ago

(Don't fix what's not broken) I would just create a new Plex LXC.

kylehase commented 7 months ago

I ran this command, which seems to have worked:

sed -i -e 's/^ssl-cert:x:104:$/render:x:104:root,jellyfin/' -e 's/^render:x:108:root,jellyfin$/ssl-cert:x:108:/' /etc/group

But I now get an error when trying to update Jellyfin using the installation/update script.

bash -c "$(wget -qLO - https://github.com/tteck/Proxmox/raw/main/ct/jellyfin.sh)"

 - Updating Jellyfin LXC  \
[ERROR] in line 61: exit code 0: while executing command apt-get -y upgrade &> /dev/null

root@jellyfin:~# apt-get -y upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
The following packages will be upgraded:
  less libssl3 libuv1 libxml2 openssl tcpdump tzdata
7 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/4939 kB of archives.
After this operation, 7168 B of additional disk space will be used.
Preconfiguring packages ...
dpkg: unrecoverable fatal error, aborting:
 unknown system group 'ssl-cert' in statoverride file; the system group got removed
before the override, which is most probably a packaging bug, to recover you
can remove the override manually with dpkg-statoverride
E: Sub-process /usr/bin/dpkg returned an error code (2)

Edit: Group 108 in /etc/group was not changed by the sed command provided on my container. The line still started with render:x:108, so I manually changed it to ssl-cert:x:108: which resolved the issue. I didn't think to check the existing line before fixing it but I believe it was render:x:108:jellyfin (without root) which would explain why sed missed it.
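
The dpkg error above comes from a statoverride entry that names a group no longer present in /etc/group. A check along the following lines could catch that before running an upgrade. It is a sketch only: the function name missing_override_groups is hypothetical, and it assumes the "user group mode path" line format that dpkg-statoverride --list prints, with groups shown by name.

```shell
# Hypothetical helper: read statoverride lines ("user group mode path")
# on stdin and print every group name that is absent from the group file.
missing_override_groups() {
  local group_file=${1:-/etc/group}
  awk -F: '
    NR == FNR { known[$1] = 1; next }   # first input: collect known group names
    {
      split($0, f, " ")                 # second input: whitespace-separated fields
      if (!(f[2] in known)) print f[2]
    }
  ' "$group_file" - | sort -u
}

# Usage inside the container:
#   dpkg-statoverride --list | missing_override_groups /etc/group
```

An empty result means every group referenced by an override still exists. A name such as ssl-cert in the output means the group file edit removed or renamed a group that dpkg still expects.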

tristan-k commented 5 months ago

The issue isn't fixed by the sed command because on my Proxmox system the render group's GID is 103, so it is not sufficient to hard-code the group number. It has to be changed dynamically.

cat /etc/group | grep "render"
render:x:103:root
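
A version that does not hard-code GIDs or member lists could look something like the sketch below. It takes the host's render GID as an argument (103 in the case above, 104 in the earlier ones), gives that GID to the container's render group, and hands render's old GID to whichever group previously held the host GID. The function name swap_render_gid is hypothetical, nothing here is part of the helper scripts, and it prints the result instead of editing the file so the output can be reviewed first.

```shell
# Hypothetical helper: swap GIDs so that "render" gets HOST_GID.
# Reads GROUP_FILE twice: pass 1 finds render's current GID,
# pass 2 rewrites the GID fields. Member lists are left untouched.
swap_render_gid() {
  local file=$1 host_gid=$2
  awk -F: -v OFS=: -v want="$host_gid" '
    NR == FNR { if ($1 == "render") old = $3; next }
    $1 == "render"                             { $3 = want }
    $1 != "render" && $3 == want && old != ""  { $3 = old }
    { print }
  ' "$file" "$file"
}

# Usage inside the container (review the output before replacing /etc/group):
#   swap_render_gid /etc/group 104 > /tmp/group.new
#   diff /etc/group /tmp/group.new
```

Because the match is on the group name and GID fields only, a line such as render:x:108:jellyfin (without root) is still handled. Unlike the sed commands in this thread, the sketch does not add members to render; that would be a separate step, for example usermod -aG render jellyfin.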

tteck commented 5 months ago

@tristan-k what Linux distribution are you using?

tristan-k commented 5 months ago

@tristan-k what Linux distribution are you using?

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:    11
Codename:   bullseye

$ pveversion
pve-manager/7.4-17/513c62be (running kernel: 5.15.83-1-pve)

tteck commented 5 months ago

Upgrade to Proxmox 8 and use the script's default settings.