SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability for MPI codes.
This changes the behavior of make check to add the option --stop-on-failure. It also adds a 'check allocation' test/script that runs before any of the parallel tests. The idea is that this script checks that users have a proper allocation before the tests run, which prevents the very slow case of multiple 'srun' commands each queueing outside of an allocation. It only stops the run if users use make check; with make test the parallel tests will still run, but at least it will print out an error message the users can stare at while their srun is queueing.
Right now the script only checks that it is running inside an allocation (that is, that a job id environment variable is set).
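As a rough illustration of the kind of check described above, here is a minimal sketch. It is not the script from this PR: the function name check_allocation is hypothetical, and it assumes a Slurm system, where SLURM_JOB_ID is set inside an allocation. Other resource managers would need a different variable.

```shell
#!/bin/sh
# Hypothetical sketch of a pre-test allocation check (not the script in this PR).
# Assumption: Slurm sets SLURM_JOB_ID inside an allocation.

check_allocation() {
    # Fail if the job id variable is unset or empty.
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "ERROR: no job allocation detected (SLURM_JOB_ID is not set)." >&2
        echo "Get an allocation before running the parallel tests." >&2
        return 1
    fi
    echo "Running inside allocation ${SLURM_JOB_ID}"
    return 0
}
```

A test driver could call check_allocation first and exit non-zero on failure, so that a run with --stop-on-failure stops before any 'srun' command is issued.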
Things to add:
check that the right number of nodes are allocated
Related to #320