ltalirz closed this issue 4 months ago
Yes, nv_peer_mem should be added to the Alma image as well. This can be fixed in the March release.
I've tried this with the latest HPC AlmaLinux 8.7 image on an NDv4 VM, and nvidia_peermem is installed with the Mellanox driver and enabled.
What version of the HPC AlmaLinux 8.7 image are you using and on which SKU?
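For anyone who wants to run the same check on their own VM, here is a minimal sketch in Python. It assumes the standard Linux `/proc/modules` format, where the first field of each line is the module name. The helper name `loaded_peer_memory_modules` is made up for illustration. It only reports whether a module is loaded, not whether GPUDirect RDMA actually works end to end.

```python
from pathlib import Path

# Module names mentioned in this thread: nvidia_peermem ships with the NVIDIA
# driver, nv_peer_mem comes from the Mellanox/nv_peer_memory repository.
PEER_MEMORY_MODULES = ("nvidia_peermem", "nv_peer_mem")


def loaded_peer_memory_modules(proc_modules_text):
    """Return the peer-memory modules found in /proc/modules-style text.

    Hypothetical helper: it looks only at the first whitespace-separated
    field of each line, which is the module name.
    """
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in PEER_MEMORY_MODULES if name in loaded]


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    # Fall back to empty text on systems without /proc/modules.
    text = proc_modules.read_text() if proc_modules.exists() else ""
    found = loaded_peer_memory_modules(text)
    print("loaded:", ", ".join(found) if found else "none")
```

An empty result on a GPU SKU would point to the situation described in the original report.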
Interesting, thanks for checking!
This was using the azhop image 2023.0705.1612, which derives from almalinux:almalinux-hpc:8_7-hpc-gen2:8.7.2023060101
@xpillons According to the azhop docs, this is the latest base image for which azhop images are available. Perhaps worth looking into upgrading this?
Mentioning @matt-chan, @adam-grofe for info
@ltalirz building new azhop image right now
Closing as we're installing nvidia_peermem with the NVIDIA drivers on the current images
A customer on the Alma 8.7 image has run into
with NVSHMEM and noticed that the `nv_peer_mem` kernel module is missing. See the docs: https://docs.nvidia.com/nvshmem/api/faq.html
Would it make sense to add `nv_peer_memory` to the azhpc image? https://github.com/Mellanox/nv_peer_memory
Anything we would need to watch out for?
mentioning also @xpillons