Open ChristianKniep opened 2 years ago
Hi Christian,
Unfortunately, ParallelCluster does not support multiArch clusters. Could you describe your use case in detail?
Thank you, Hanwen
In the EDA world, many applications support both architectures. Arm64 has a lower cost per unit of performance, but not all applications support it, so in that use case both architectures are required. For capacity reasons, a mixed cluster also makes more instances available for large workloads.
I don't know about @ChristianKniep's use case but in ours we have a mix of GPU/CPU queues and we wanted to try the C7g instances for some of our CPU workloads (as they are cheaper and supposed to be faster).
However, since ParallelCluster does not support multiArch clusters, we would also need to migrate our GPU queue, which is currently set up to use a `g4dn.2xlarge`, to a `g5g.4xlarge` instance type. This is not viable for us, as most of our workloads are GPU-driven and moving from G4dn to G5g represents a ~10% increase in cost.
As discussed via email with @demartinofra and Austin, I'd like to create a cluster with x86 and Arm compute nodes, in my case with an x86 head node.
AFAIU this is currently not possible, since the head node exports `/opt/slurm` to all compute nodes. Thus, the Arm compute nodes use the x86 binaries under `/opt/slurm/bin` and will segfault on any Slurm commands.
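To confirm that a compute node is picking up binaries built for the wrong architecture, something like the following could be run on the node. This is only a minimal sketch, not part of ParallelCluster or Slurm: the helper names `elf_machine` and `binary_matches_host` are made up for illustration, and it only relies on the `e_machine` field of the ELF header (62 for x86-64, 183 for AArch64).

```python
import platform
import struct

# e_machine values defined by the ELF specification.
ELF_MACHINES = {62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the architecture named in an ELF header, or 'unknown'."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown")


def binary_matches_host(path: str) -> bool:
    """Check whether the binary at `path` was built for this host's CPU."""
    with open(path, "rb") as f:
        arch = elf_machine(f.read(20))
    host = platform.machine().lower()
    # Normalize alternative names some platforms report.
    host = {"amd64": "x86_64", "arm64": "aarch64"}.get(host, host)
    return arch == host
```

For example, `binary_matches_host("/opt/slurm/bin/sinfo")` should return `False` on an Arm compute node that mounts the x86 head node's export (assuming `sinfo` is installed under that path).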