LLNL / scr

SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability for MPI codes.
http://computing.llnl.gov/projects/scalable-checkpoint-restart-for-mpi

Add basic Flux support #469

Closed ofaaland closed 2 years ago

ofaaland commented 2 years ago

- Add FLUX value to SCR_RESOURCE_MANAGER cmake property
- Read Flux Job ID from environment in scr_env_jobid()
- Add FLUX subdirectory and stubs for shell scripts

Passed simple testing on elmerfudd with -DSCR_RESOURCE_MANAGER=FLUX
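
For readers unfamiliar with the job ID change, here is a minimal sketch of what reading the Flux job ID from the environment could look like. It is not the code from this PR: the function names `example_copy_jobid` and `example_flux_jobid` are hypothetical, and it assumes Flux exports the job ID to the job environment as `FLUX_JOB_ID`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of SCR): return a newly allocated copy of
 * a job ID string, or NULL if the value is missing or empty.
 * The caller is responsible for freeing the result. */
char* example_copy_jobid(const char* value)
{
  if (value == NULL || value[0] == '\0') {
    return NULL;
  }

  size_t len = strlen(value) + 1; /* include the terminating NUL */
  char* copy = malloc(len);
  if (copy != NULL) {
    memcpy(copy, value, len);
  }
  return copy;
}

/* Hypothetical stand-in for the Flux branch of scr_env_jobid().
 * Assumption: Flux sets FLUX_JOB_ID in the environment of a running job. */
char* example_flux_jobid(void)
{
  return example_copy_jobid(getenv("FLUX_JOB_ID"));
}
```

Outside a Flux job the variable would be unset, so `example_flux_jobid()` returns NULL and the caller can fall back to whatever default it prefers.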

ofaaland commented 2 years ago

@adammoody please take a look, thanks.

adammoody commented 2 years ago

Thanks, @ofaaland. Looks good to me.

Since we're about to stamp a v3.0 release, let's merge this into the develop branch after the release. In the meantime, could you create a "flux" branch from develop so we can merge this PR into it? We can then merge flux into develop sometime after the release.

ofaaland commented 2 years ago

Hi @adammoody that makes sense. Re: the flux branch, it looks like permissions don't allow me to create a branch in the scr repo.

adammoody commented 2 years ago

Ah, ok. We may need to add you to some list. I went ahead and created a flux branch and merged your PR into that.