Context
Check https://github.com/ansys/pymapdl/pull/2865 for a bit of historical context, which led to https://github.com/ansys/pymapdl/pull/3091.
In #2865 we proposed using PyHPS to interact with HPC clusters. While PyHPS is very powerful, it is not a scheduler, so it must be installed on top of a scheduler (such as SLURM) and depends on it.
In this PR we are going to support SLURM HPC clusters only, directly and without PyHPS.
Research
Check #3397 for the research done on launching MAPDL and PyMAPDL on SLURM clusters.
Introduction
For the moment we are going to focus on launching single MAPDL instances, leaving aside `MapdlPool` since it creates issues regarding resource splitting. Coming up with a good default resource-sharing scheme might be a bit tricky.
We are also going to focus on the most useful cases:
[Case 1] Batch script submission (Scenario A in #2865)
[Case 2] Interactive MAPDL instance on HPC, and PyMAPDL on entrypoint (Scenario B in #2865)
[Case 3] Interactive MAPDL instance on HPC, and PyMAPDL on outside-cluster computer (Similar to scenario B in #2865)
We might need to SSH to the entry point machine.
[Case 4] Batch submission from an outside-cluster machine. This is tricky because attaching files is complicated. The issue disappears when running interactively, because PyMAPDL can take care of uploading the files to the instance. So we will leave this case for the very end.
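To make Case 1 concrete: the batch workflow boils down to wrapping a Python script in a SLURM submission. A minimal sketch of composing that submission, where the helper name, script name, and all resource values are illustrative placeholders (not part of any PyMAPDL API):

```python
# Hypothetical helper: compose the sbatch invocation that submits a
# PyMAPDL batch script to SLURM (Case 1). All flag values are examples.
import shlex


def sbatch_command(script, nodes=1, ntasks=4, memory="16GB", partition="qsmall"):
    """Return the sbatch invocation as a list of arguments."""
    return [
        "sbatch",
        f"--nodes={nodes}",          # number of compute nodes
        f"--ntasks={ntasks}",        # total CPU cores for MAPDL
        f"--mem={memory}",           # memory per node
        f"--partition={partition}",  # SLURM queue/partition name
        "--wrap", f"python {shlex.quote(script)}",
    ]


cmd = sbatch_command("my_pymapdl_job.py", ntasks=8)
print(" ".join(cmd))
```

The idea is that, inside the allocation, `launch_mapdl()` can pick up the SLURM environment so the user script does not hardcode resources; the exact detection behavior is what the roadmap PRs below implement.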
Roadmap
Start to implement this on the following PRs:
[x] Basic reorganization of the main files. Just a reorg, to avoid triggering Kathy's review. #3436 #3465
[ ] Add info about MAPDL running on clusters, env var etc.
[ ] [Case 1] Running PyMAPDL locally on a cluster smoothly #3466
[ ] Allow PyMAPDL to start and connect to interactive sessions
To be broken into different PRs depending on where the client is running:
[ ] [Case 2] Client is running on the headnode #3497
[ ] [Case 3] Client is running outside the cluster.
[ ] Check whether we can implement a command-line interface to simplify submitting scripts or launching remote MAPDL instances. #2961
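For Cases 2 and 3, the client side ultimately only needs the hostname/IP and gRPC port of the running MAPDL instance. A sketch of a client-side helper that resolves them, assuming the `PYMAPDL_IP`/`PYMAPDL_PORT` environment variables PyMAPDL honors; the defaults and fallback logic here are illustrative assumptions, not the library's actual behavior:

```python
# Hypothetical client-side helper for Cases 2/3: figure out where the
# MAPDL instance is listening. Env var names mirror the ones PyMAPDL
# reads (PYMAPDL_IP / PYMAPDL_PORT); defaults are assumptions.
import os

DEFAULT_PORT = 50052  # default MAPDL gRPC port


def resolve_mapdl_address(default_ip="127.0.0.1"):
    """Return (ip, port) to connect to, preferring environment variables."""
    ip = os.environ.get("PYMAPDL_IP", default_ip)
    port = int(os.environ.get("PYMAPDL_PORT", DEFAULT_PORT))
    return ip, port


ip, port = resolve_mapdl_address()
# The client would then connect with something like:
#   from ansys.mapdl.core import Mapdl
#   mapdl = Mapdl(ip=ip, port=port)
print(f"Would connect to {ip}:{port}")
```

For Case 3 the same resolution applies, except the IP may point at an SSH tunnel endpoint rather than directly at the compute node.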