SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability for MPI codes.
While waiting for an async flush to finish, SCR sleeps for a short time between polls to avoid busy-spinning on the CPU, which could otherwise compete with the processes that are conducting the flush. This adds a new SCR_FLUSH_ASYNC_USLEEP parameter that allows the user to set the sleep time. It also lowers the default value from 10 seconds to 1 millisecond.
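The sleep-between-polls behavior described above can be sketched roughly as follows. This is an illustrative sketch, not SCR's actual implementation. The function names `flush_async_usleep`, `wait_for_flush`, and `countdown_done` are hypothetical, reading the parameter from an environment variable is an assumption made for the example, and the microsecond unit is inferred from the `USLEEP` suffix (so the 1 millisecond default becomes 1000).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Default sleep between polls: 1 millisecond, expressed in microseconds
 * (unit assumed from the USLEEP suffix of the parameter name). */
#define FLUSH_ASYNC_USLEEP_DEFAULT 1000UL

/* Hypothetical helper: read the sleep interval from the environment
 * (an assumption for this sketch), falling back to the default when the
 * value is unset or is not a plain non-negative integer. */
unsigned long flush_async_usleep(void)
{
    const char *value = getenv("SCR_FLUSH_ASYNC_USLEEP");
    if (value == NULL || !isdigit((unsigned char)value[0])) {
        return FLUSH_ASYNC_USLEEP_DEFAULT;
    }

    char *end = NULL;
    errno = 0;
    unsigned long usecs = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0') {
        return FLUSH_ASYNC_USLEEP_DEFAULT;
    }
    return usecs;
}

/* Hypothetical wait loop: is_done stands in for whatever check reports
 * that the async flush has completed. Sleeps between checks so the
 * waiting process does not spin on the CPU. Returns the number of
 * unsuccessful polls before completion. */
unsigned long wait_for_flush(int (*is_done)(void *), void *state)
{
    assert(is_done != NULL);

    unsigned long usecs = flush_async_usleep();
    struct timespec pause;
    pause.tv_sec = (time_t)(usecs / 1000000UL);
    pause.tv_nsec = (long)(usecs % 1000000UL) * 1000L;

    unsigned long polls = 0;
    while (!is_done(state)) {
        polls++;
        if (usecs > 0) {
            nanosleep(&pause, NULL);
        }
    }
    return polls;
}

/* Example completion check: reports done once the counter reaches zero. */
int countdown_done(void *state)
{
    int *remaining = state;
    if (*remaining == 0) {
        return 1;
    }
    (*remaining)--;
    return 0;
}
```

With a short interval, the waiting process notices completion soon after the flush finishes. A longer interval leaves more CPU time for the processes doing the flush but delays the wakeup by up to one interval.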