https://github.com/chapel-lang/chapel/pull/23950 will start building our runtimes based on `CHPL_GPU_ARCH` for AMD. https://github.com/chapel-lang/chapel/pull/23789 made a similar change for NVIDIA, but relied on compatibility to list multiple (relatively old) architectures.

But maybe NVIDIA should also have used the single architecture that is set either automatically or by the user via `CHPL_GPU_ARCH`. In the past, we hesitated to build our runtime binary based on this variable, because we wanted to keep the runtime more portable. This also came up in the context of the Cray modules we build for Chapel, for the future when we make them GPU-enabled.

Looking into this a bit more, the worry about Cray modules is mitigated by the fact that Cray systems typically provide modules like `craype-accel-gfx90a` (where `gfx90a` is the AMD architecture of the MI250X). Further, the Frontier documentation recommends loading these modules for using AMD GPUs. On an internal system, I also see similar modules for NVIDIA compute capabilities.

So, should we stop worrying about supporting multiple architectures (https://github.com/chapel-lang/chapel/issues/22783, which could enable binary portability and allow using integrated GPUs)? Should we also try to deduce `CHPL_GPU_ARCH` from `craype` modules if they exist?

`CHPL_TARGET_CPU` is in a similar position here. Investigating `chpl_cpu.py` can help figure out how to do module-based deduction for the architecture name.

Related CPU issue: https://github.com/Cray/chapel-private/issues/2966
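
To make the module-based deduction idea concrete, here is a minimal sketch of what it could look like. It is not taken from `chpl_cpu.py` or any existing Chapel script. The function name `gpu_arch_from_modules` and the regular expression are hypothetical. The sketch assumes that loaded modules are visible through the colon-separated `LOADEDMODULES` environment variable, that accelerator modules are named like `craype-accel-gfx90a`, `craype-accel-amd-gfx90a`, or `craype-accel-nvidia80`, and that `nvidia80` corresponds to `sm_80`. All of these would need checking against real systems.

```python
import os
import re

# Hypothetical pattern, an assumption about module naming. It accepts names
# such as "craype-accel-gfx90a", "craype-accel-amd-gfx90a", and
# "craype-accel-nvidia80", each with an optional "/version" suffix.
_ACCEL_MODULE = re.compile(
    r"^craype-accel-(?:amd-)?(gfx[0-9a-f]+|nvidia[0-9]+)(?:/.*)?$"
)


def gpu_arch_from_modules(loaded_modules=None):
    """Return a CHPL_GPU_ARCH-style string deduced from loaded modules.

    Returns None when no craype-accel module is recognized.
    """
    if loaded_modules is None:
        # LOADEDMODULES is a colon-separated list of loaded modulefiles.
        loaded_modules = os.environ.get("LOADEDMODULES", "")

    for module in loaded_modules.split(":"):
        match = _ACCEL_MODULE.match(module)
        if match is None:
            continue
        arch = match.group(1)
        if arch.startswith("nvidia"):
            # Assumption: "nvidia80" means compute capability 8.0 ("sm_80").
            return "sm_" + arch[len("nvidia"):]
        return arch  # AMD names such as "gfx90a" are used unchanged

    return None
```

In a real implementation, a value of `CHPL_GPU_ARCH` set explicitly by the user would presumably take precedence, with this deduction used only as a fallback.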