truecharts / public

Community Helm Chart Repository
https://truecharts.org
GNU Affero General Public License v3.0

Problems with legacy GPUs on updated APPs #7955

Closed b-m-f closed 1 year ago

b-m-f commented 1 year ago

App Name

Jellyfin

SCALE Version

22.12.2

App Version

10.8.9_14.0.1

Application Events

application-evets

Application Logs

Container is not starting. No logs available

Application Configuration

Nothing has changed since the previous version, which ran fine, except for the newly added GPU field.

Describe the bug

With a legacy GPU (GT530), the nvidia-container-runtime does not work on TrueNAS SCALE, because the legacy driver is not installed.

GPU support was disabled in the Kubernetes settings, but the updated application starts loading GPU-specific components without checking that flag.

To Reproduce

Expected Behavior

The app should run without trying to load the NVIDIA drivers. The GPU should be ignored.

Screenshots

/

Additional Context

/


PrivatePuffin commented 1 year ago

@stavros-k I think you will want to have to look into this one!

PrivatePuffin commented 1 year ago

After discussing it, priority is low, as this is likely not related to us.

PrivatePuffin commented 1 year ago

Note: I would advise trying to add the same GPU on the official Apps. If those also throw this error, file an issue with iX. If those work, please report back.

b-m-f commented 1 year ago

@Ornias1993 the problem is that I am not adding the GPU to the container. But somehow the nvidia-runtime wants to load the driver.

This does not happen with official charts or Truecharts before the breaking change.

PrivatePuffin commented 1 year ago

@stavros-k Could you look into potential differences in how we and the official charts handle the runtimeclass, in due time?

PrivatePuffin commented 1 year ago

@b-m-f Can you confirm you're trying this on the official "community" train? That is the train you should be testing against, as it has mostly the same backend in this regard, whereas their stable train has an older codebase.

b-m-f commented 1 year ago

@Ornias1993 I tried it with sonarr & radarr from both the "community" train and Truecharts, and both deploy successfully.

It seems that this issue is jellyfin specific and I have jumped to conclusions, sorry!

PrivatePuffin commented 1 year ago

Interesting, in that case it is indeed most likely a container issue, as GPUs are handled globally.

Closing as "most likely upstream issue"

PrivatePuffin commented 1 year ago

Thanks for all the effort put into this btw!

PrivatePuffin commented 1 year ago

@all-contributors please add @b-m-f for bug

allcontributors[bot] commented 1 year ago

@Ornias1993

I've put up a pull request to add @b-m-f! :tada:

b-m-f commented 1 year ago

I have figured out what is happening.

On the old release, the environment variable NVIDIA_VISIBLE_DEVICES=void was set on the pod.

In the Docker image, it defaults to NVIDIA_VISIBLE_DEVICES=all.

Setting NVIDIA_VISIBLE_DEVICES=void via the extra environment variables fixes the issue.
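
To illustrate the behaviour described above, here is a minimal Python sketch of the defaulting logic. It is not the actual chart code: `container_env` and its parameters are hypothetical names. It assumes the chart injects NVIDIA_VISIBLE_DEVICES=void whenever no GPU is assigned, so that the image default of `all` never reaches the NVIDIA runtime, while explicitly supplied extra environment variables still take precedence.

```python
from typing import Dict, Optional


def container_env(gpu_requested: bool,
                  extra_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build the environment for a container (hypothetical helper).

    When no GPU is assigned, NVIDIA_VISIBLE_DEVICES is set to "void" so the
    NVIDIA runtime does not try to expose any GPU or load a driver.
    """
    env: Dict[str, str] = {}
    if not gpu_requested:
        # Override the image default ("all") for pods without a GPU.
        env["NVIDIA_VISIBLE_DEVICES"] = "void"
    # User-supplied extra environment variables win over the default.
    env.update(extra_env or {})
    return env


if __name__ == "__main__":
    print(container_env(gpu_requested=False))
    print(container_env(gpu_requested=True))
```

With this kind of default in place, the manual workaround of adding the variable through the extra environment variables would only be needed on releases that lack it.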

PrivatePuffin commented 1 year ago

@stavros-k I thought we already set this?

PrivatePuffin commented 1 year ago

New plex and jellyfin releases are out with this fix @b-m-f

stavros-k commented 1 year ago

@all-contributors please add @b-m-f for bug

allcontributors[bot] commented 1 year ago

@stavros-k

@b-m-f already contributed before to bug

truecharts-admin commented 1 year ago

This issue is locked to prevent necro-posting on closed issues. Please create a new issue or contact staff on Discord if the problem persists.