I have a dev lab running OKD 4.4 that I use for dev/test/personal/whatever purposes. All my nodes are running the latest FCOS 31 build.
I've added an NVIDIA Maxwell GPU to one of my nodes through VFIO passthrough, and I was hoping this package could help me get the drivers and runtimes onto the node so I can use my GPU as a video encoder in one of my applications (I need to be able to install at least enough to support CUVID).
I took a quick look through the code, and it looks like it's only designed to recognize RHEL installations. Is there any plan for this operator to support clusters deployed on FCOS?
CC @cgwalters -- I have seen you discussing FCOS kernel module management in other threads.
--
FWIW, a couple of things I noted from messing around:
Attempting to deploy the operator's default nvidia-gpu configuration on FCOS-based nodes produces a very strange error message, because pkg/controller/specialresource/runtime.go#renderOperatingSystem() returns an empty string. As a result, the driver container gets an invalid name that ends in a hyphen.
Just for fun, I manually relabeled my node to fool the operator's pod into thinking it was running on RHEL CoreOS. That actually worked, but the subsequent installation of course failed with:
+ dnf install -q -y elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
Error: Unable to find a match: elfutils-libelf-devel.x86_64
It's not the kind of error I was expecting to get, but there it is.
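To illustrate the first point, here is a minimal sketch of how an empty OS string turns into a name ending in a hyphen, and how a guard could surface a clearer error instead. This is not the operator's actual code: driverName, the "nvidia-driver" prefix, and the "fedora31" tag are hypothetical names I made up for illustration. The only thing taken from Kubernetes is the DNS-1123 label pattern used for validation.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Label is the DNS-1123 label pattern that Kubernetes applies to many
// resource names: lowercase alphanumerics and '-', starting and ending with
// an alphanumeric character.
var dns1123Label = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

// driverName is a hypothetical helper (an assumption, not taken from the
// operator). It joins a prefix and an OS tag, and fails early when the OS
// tag is empty instead of emitting a name like "nvidia-driver-".
func driverName(prefix, osTag string) (string, error) {
	if osTag == "" {
		return "", fmt.Errorf("unrecognized operating system, cannot build a name from prefix %q", prefix)
	}
	name := prefix + "-" + osTag
	if len(name) > 63 || !dns1123Label.MatchString(name) {
		return "", fmt.Errorf("invalid resource name %q", name)
	}
	return name, nil
}

func main() {
	// Naive concatenation with an empty OS tag leaves a trailing hyphen,
	// which does not match the DNS-1123 label pattern.
	naive := "nvidia-driver" + "-" + ""
	fmt.Printf("naive name %q valid: %v\n", naive, dns1123Label.MatchString(naive))

	// With the guard, the failure is reported where it actually happens.
	if _, err := driverName("nvidia-driver", ""); err != nil {
		fmt.Println("error:", err)
	}

	// A non-empty (hypothetical) OS tag yields a valid name.
	if name, err := driverName("nvidia-driver", "fedora31"); err == nil {
		fmt.Println("name:", name)
	}
}
```

Failing at the point where the OS is detected would at least make the "unsupported OS" case obvious, rather than leaving it to show up later as a naming error.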