openpmix / prrte

PMIx Reference RunTime Environment (PRRTE)
https://pmix.org

Checklist for "stable" landing point #2020

Open rhc54 opened 1 month ago

rhc54 commented 1 month ago

With the project winding down, it is time to define a stable landing point where we can leave it for those wanting to use it. This means:

We'll keep a checklist here as we work through the process, which will culminate in a new PRRTE v4 release series.

Code pruning and correction

Enhancements

Scheduler integration

* [ ] Resolve "permanent" solution to the Slurm plm problem - use new launcher lib _if_ it becomes available, otherwise may need to remove envar support for the internal "srun" cmd line options

naughtont3 commented 1 month ago

Quick follow-up after the 3 Oct 2024 teleconf: I was mistaken, and SLURM_VERSION is not exported as an environment variable within the allocation. It appears you must go through one of the utilities (e.g., srun --version or scontrol show config | grep SLURM_VERSION).

shell: $ srun --version
slurm 24.05.2
shell: $ scontrol show config | grep SLURM_VERSION
SLURM_VERSION           = 24.05.2
shell: $ echo $SLURM_VERSION

shell: $
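
For anyone who wants to do this lookup programmatically, here is a minimal sketch in Python. It is not PRRTE code. The helper names `parse_slurm_version` and `query_slurm_version` are hypothetical, and the sketch assumes the version always appears as three dot-separated numbers, as in the output above. The parser expects a single line (the `srun --version` output, or the already-grepped `SLURM_VERSION` line from `scontrol show config`), not the full `scontrol` dump.

```python
import re
import subprocess

# Assumption: the version is printed as three dot-separated integers,
# e.g. "24.05.2", matching the transcript above.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_slurm_version(text):
    """Return (major, minor, micro) as ints, or None if no version is found."""
    match = _VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def query_slurm_version():
    """Run `srun --version` and parse it; None if srun is missing or fails."""
    try:
        result = subprocess.run(
            ["srun", "--version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_slurm_version(result.stdout)
```

Returning a tuple of integers makes comparisons such as `version >= (24, 5, 0)` straightforward, and a `None` result leaves it to the caller to decide what to do when the version cannot be determined.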

rhc54 commented 1 month ago

If you just get an allocation (salloc and no srun) is there anything you can see that might give us a hint as to version, even if it doesn't give us a direct value?

naughtont3 commented 1 month ago

Unfortunately, I do not see anything that would give an indication (salloc and then env | grep SLURM).
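
The same check can be scripted. This is a rough sketch with hypothetical helper names (`slurm_env`, `version_hints`). Unlike `env | grep SLURM`, it only looks at variable names that start with `SLURM`, not at values.

```python
import os


def slurm_env(environ=None):
    """Return the SLURM* variables from an environment mapping."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in env.items()
            if name.startswith("SLURM")}


def version_hints(environ=None):
    """Return the sorted names of SLURM* variables that mention VERSION."""
    return sorted(name for name in slurm_env(environ) if "VERSION" in name)


if __name__ == "__main__":
    # Inside an salloc allocation this prints whatever version-like
    # variables exist; an empty list means there is no hint available.
    print(version_hints())
```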

rhc54 commented 1 month ago

The "oob collapse" has been completed - see https://github.com/openpmix/prrte/pull/2035

edgargabriel commented 3 weeks ago

@rhc54 Is the fix of https://github.com/open-mpi/ompi/issues/12682 already in as well?

rhc54 commented 3 weeks ago

Yes - everything is caught up. I have one thing still in the queue, but it's being tracked over in the PMIx repo. Otherwise, everything still needing attention is listed above, and anything else is done. The fix you ask about is also in the release branch awaiting update over in OMPI.