chapel-lang / chapel

A Productive Parallel Programming Language
https://chapel-lang.org

`CHPL_GPU_ARCH`: should we build our runtime for it? Can we deduce it from CPE modules? #23960

Open e-kayrakli opened 11 months ago

e-kayrakli commented 11 months ago

https://github.com/chapel-lang/chapel/pull/23950 will start building our runtimes based on CHPL_GPU_ARCH for AMD. https://github.com/chapel-lang/chapel/pull/23789 made a similar change for NVIDIA, but relied on binary compatibility, listing multiple (relatively old) architectures.

But maybe NVIDIA should also have used the single architecture that is set either automatically or by the user via CHPL_GPU_ARCH. In the past, we had some hesitation about building our runtime binary based on this, to keep the runtime more portable. This also came up in the context of the Cray modules we build for Chapel, looking ahead to when we make them GPU-enabled.

Looking into this a bit more, the worry about Cray modules can be mitigated by the fact that Cray systems typically provide modules like craype-accel-gfx90a (where gfx90a is an AMD architecture; it is the one the MI250X uses). Further, the Frontier documentation recommends loading these modules to use the AMD GPUs. On an internal system, I also see similar modules for NVIDIA compute capabilities.

So, should we stop worrying about multiple architectures (https://github.com/chapel-lang/chapel/issues/22783; supporting them can enable binary portability and allow using integrated GPUs)? Should we also try to deduce this setting from craype modules when they exist?
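As a sketch of what module-based deduction could look like: the helper below scans the standard `LOADEDMODULES` environment variable for a `craype-accel-*` module and maps it to an architecture name. The function name, the specific module names, and the NVIDIA-to-`sm_*` mapping are assumptions based on typical Cray PE installations, not Chapel's actual logic.

```python
import os

def gpu_arch_from_craype_modules(loaded=None):
    """Deduce a GPU architecture from loaded Cray PE modules (sketch).

    Assumes modules named like "craype-accel-gfx90a" (AMD MI250X) or
    "craype-accel-nvidia80" (NVIDIA A100), listed colon-separated in
    the LOADEDMODULES environment variable, as module systems do.
    """
    if loaded is None:
        loaded = os.environ.get("LOADEDMODULES", "")
    for module in loaded.split(":"):
        name = module.split("/")[0]  # drop any version suffix
        if name.startswith("craype-accel-"):
            arch = name[len("craype-accel-"):]
            if arch.startswith("nvidia"):
                # e.g. "nvidia80" -> compute capability "sm_80"
                return "sm_" + arch[len("nvidia"):]
            return arch  # AMD names like "gfx90a" are used as-is
    return None  # no accelerator module loaded; fall back elsewhere
```

A caller could treat a `None` result as "no module hint" and fall back to probing the hardware or requiring the user to set CHPL_GPU_ARCH explicitly.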

CHPL_TARGET_CPU is in a similar position here. Investigating chpl_cpu.py can help figure out how to do module-based deduction for the architecture name.

Related CPU issue: https://github.com/Cray/chapel-private/issues/2966

bradcray commented 11 months ago

I definitely like the notion of having CHPL_GPU_ARCH be inferred from craype modules, when available.