Open codeknight03 opened 2 months ago
In production, I changed the condition to:
if [ $GPU_PRESENT -eq 0 ] && [ $GPU_CONTAINER_PRESENT -lt 0 ]; then
With this change, the container CLI is not installed on the head node or the login node, but it is installed on the worker nodes, where nvidia-smi is present.
The pyxis post-install script does not install the NVIDIA Container CLI in any case:
https://github.com/aws-samples/aws-parallelcluster-post-install-scripts/blob/main/pyxis/postinstall.sh#L45-L47
This is caused by the condition on the linked lines.
It installs only if $GPU_CONTAINER_PRESENT > 1, which is the case when nvidia-smi is not available. When nvidia-smi is available but the container CLI is not, the installation does not take place.
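For reference, here is a minimal sketch of the behavior I would expect: install only when a GPU is detected and the container CLI is missing. This is not the code from postinstall.sh. The function name `should_install_container_cli` and the variables `gpu_status` and `cli_status` are hypothetical, and I am assuming both probes report success with exit status 0.

```shell
#!/bin/bash
# Hypothetical helper, not taken from postinstall.sh.
# Both arguments are exit statuses: 0 means the probe succeeded.
should_install_container_cli() {
    local gpu_status="$1"   # status of the GPU probe (e.g. nvidia-smi -L)
    local cli_status="$2"   # status of the container CLI probe
    # Install only when a GPU is present AND the container CLI is missing.
    [ "$gpu_status" -eq 0 ] && [ "$cli_status" -ne 0 ]
}

# Probe the node. A missing binary simply counts as "not present".
if nvidia-smi -L > /dev/null 2>&1; then gpu_status=0; else gpu_status=1; fi
if nvidia-container-cli info > /dev/null 2>&1; then cli_status=0; else cli_status=1; fi

if should_install_container_cli "$gpu_status" "$cli_status"; then
    echo "GPU found, container CLI missing: install it"
else
    echo "No GPU, or container CLI already present: skip"
fi
```

With this logic the head node and login node (no nvidia-smi) are skipped, and a worker node with nvidia-smi but without the container CLI triggers the installation.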