Open shemi-plgs opened 2 years ago
Hi there!
RHEL 8.x is not a supported or tested platform for this role. Please see: https://github.com/NVIDIA/ansible-role-nvidia-docker/blob/c5cb5cbfec7739f4ac2c0a4c9737202662e2ea04/meta/main.yml
That said, we would probably be open to a PR to add the necessary logic for this support.
When installing the nvidia-container-runtime with this role, I still had an issue and couldn't launch any GPU tasks. The error was:
"Error response from daemon: OCI runtime create failed"
I had used the Ansible role on Ubuntu and it worked fine, but on RHEL 8.4 I always got this error after the install.
After investigating, I found that on Ubuntu, installing the nvidia-container-runtime package pulls in the nvidia-container-toolkit dependency, but on RHEL it does not. It is this executable that container runtime platforms use to initiate GPU tasks. This dependency is also a dependency of the nvidia-docker2 package, but in your role you only get the script.
I was able to make everything work by installing the missing nvidia-container-toolkit with yum.
Is this missing dependency on RedHat platforms normal?
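For anyone hitting the same thing, the yum workaround above could be written as an Ansible task roughly like the one below. This is only a sketch and not part of the role as published: the task name and the `when` condition are my own assumptions, and it assumes the NVIDIA package repository is already configured on the host (which it should be if nvidia-container-runtime was installed from it).

```yaml
# Hypothetical task, not taken from the role itself.
# Assumes the NVIDIA repository is already enabled on the host.
- name: Install nvidia-container-toolkit on RedHat-family hosts
  ansible.builtin.yum:
    name: nvidia-container-toolkit
    state: present
  become: true
  # Only run on RedHat-family systems; Ubuntu already gets the
  # package as a dependency of nvidia-container-runtime.
  when: ansible_os_family == "RedHat"
```

It can go in a playbook right after the role is applied. Docker may need a restart afterwards before GPU containers start correctly.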