ROCm / ROCm-docker

Dockerfiles for the various software layers defined in the ROCm software platform
MIT License

[Feature]: Distributing ROCm and kernel drivers in OpenShift 4.x #125

Closed lohbe closed 3 months ago

lohbe commented 5 months ago

Suggestion Description

Container platforms such as OpenShift 4.x are increasingly popular in enterprises, so there should be plans to support driver installation beyond amdgpu-install and package-based installations. In particular, OpenShift uses RHCOS, an rpm-ostree-based immutable OS that does not fully conform to the Linux FHS; for example, /var/lib/dkms is not present in the OS.

The ask here is to consider working with Red Hat, as described in the Driver Toolkit documentation, to enable native OpenShift + RHCOS support. This would be a significant quality-of-life improvement over the current alternative of running a separately managed RHEL node (with its associated limitations) on OpenShift.
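
To illustrate the constraint described above, here is a minimal sketch of a check for whether a node looks like an ostree-booted host and whether the DKMS tree exists. The function name `dkms_install_feasible` and its `root` parameter are hypothetical, introduced only for this example. The check assumes that the `/run/ostree-booted` marker file identifies an ostree-based system, and it is a heuristic, not a test of whether amdgpu-dkms would actually install.

```python
from pathlib import Path


def dkms_install_feasible(root: str = "/") -> dict:
    """Heuristically report whether a DKMS-based driver install looks possible.

    `root` is a hypothetical parameter that lets the check run against a
    mounted or mocked filesystem tree instead of the live host.
    """
    base = Path(root)
    # Assumption: ostree-booted systems (such as RHCOS) expose this marker file.
    ostree_booted = (base / "run" / "ostree-booted").exists()
    # DKMS keeps its module source and build tree under /var/lib/dkms.
    dkms_tree_present = (base / "var" / "lib" / "dkms").is_dir()
    return {
        "ostree_booted": ostree_booted,
        "dkms_tree_present": dkms_tree_present,
        # Heuristic only: a DKMS tree on a non-ostree host suggests the
        # package-based path is usable; it does not guarantee a working build.
        "dkms_install_likely": dkms_tree_present and not ostree_booted,
    }


if __name__ == "__main__":
    print(dkms_install_feasible())
```

On an RHCOS node, a check like this would be expected to report the missing DKMS tree, which is why a container-based build path such as the Driver Toolkit is being requested.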

Operating System

RHCOS

GPU

No response

ROCm Component

ROCk kernel driver, amdgpu-dkms

jiridanek commented 4 months ago

See these two projects, especially the second one

lohbe commented 3 months ago

Thanks, the 2nd one indeed looks like what I was asking for.