Open wwarriner opened 1 year ago
If you create a reservation to facilitate a research workflow, jobs will still be subject to QoS and job time limit restrictions. If these are barriers, a temporary partition will also need to be created. The ops team will need to perform the node- and partition-related steps below.
A sample partition definition for posterity
PartitionName=$name Default=NO MinNodes=1 MaxNodes=5 MaxTime=6-06:00:00 DefaultTime=01:00:00 AllowGroups=ALL DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=NO GraceTime=0 PreemptMode=OFF ReqResv=NO AllowAccounts=ALL AllowQos=ALL LLN=NO ExclusiveUser=NO PriorityJobFactor=1 PriorityTier=8 OverSubscribe=NO State=UP Nodes=c[0232-0235]
You can require a reservation for access to the partition by setting ReqResv=YES. This allows dynamic limitation of who is authorized to use the partition.
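A minimal sketch of toggling this on a running partition, assuming the partition already exists and you have Slurm admin rights. The partition name `temp-research` and the `run` dry-run helper are hypothetical; a change made with `scontrol update` may not survive a controller restart unless the partition definition in slurm.conf is updated as well.

```shell
#!/bin/sh
# Hypothetical partition name; substitute the real one.
partition="temp-research"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Require a reservation for any job submitted to the partition.
require_reservation() {
    run scontrol update PartitionName="$partition" ReqResv=YES
}

# Show the partition so the ReqResv setting can be checked.
show_partition() {
    run scontrol show partition "$partition"
}

require_reservation
show_partition
```

Run it once as-is to review the commands, then again with `RUN=1` to apply them.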
Creating a reservation
You should be able to create a reservation; I believe Ops, Dev, and DataSci have this authority. Example: a 30-day reservation on c0220-c0223 for 3 users, starting now.
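The original command is not preserved here, so the following is only a sketch of what such a reservation could look like with `scontrol create reservation`. The reservation name and the three usernames are hypothetical placeholders, and the `run` helper is a dry-run wrapper added for illustration.

```shell
#!/bin/sh
# Hypothetical values; only the node range comes from the example above.
resv_name="research-resv"
users="user1,user2,user3"
nodes="c[0220-0223]"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 30-day reservation starting immediately (Duration is days-hours:minutes:seconds).
create_reservation() {
    run scontrol create reservation \
        ReservationName="$resv_name" \
        StartTime=now \
        Duration=30-00:00:00 \
        Users="$users" \
        Nodes="$nodes"
}

create_reservation
```

Afterwards, `scontrol show reservation` lists the reservation along with its nodes, users, and end time.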
Using a reservation (for researchers)
I also noticed that we need to pass --reservation=$resv-name to sbatch and srun to make use of it.
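For researchers, a sketch of what that looks like in practice. The reservation name and `job.sh` are hypothetical, and the `run` helper only prints the commands unless `RUN=1` is set. If the partition requires a reservation, the usual `--partition` option would presumably be needed too.

```shell
#!/bin/sh
# Hypothetical reservation name and job script.
resv_name="research-resv"
job_script="job.sh"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Batch job submitted into the reservation.
submit_batch() {
    run sbatch --reservation="$resv_name" "$job_script"
}

# Interactive shell on a reserved node.
start_interactive() {
    run srun --reservation="$resv_name" --pty bash
}

submit_batch
start_interactive
```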
Updating a reservation
Adding nodes to a reservation
If you do add new nodes, you’ll have to delete the reservation and recreate it. To avoid jobs jumping onto the nodes in the short time between the delete and the create, you should first drain all of the reservation nodes (make sure to update the node list in all of the commands).
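The drain, delete, recreate sequence could be sketched roughly as follows. This is an assumption-laden outline, not the exact commands used: the reservation name, usernames, and the added node `c0224` are hypothetical, the drain is applied to the full new node list, and the duration should be adjusted to whatever time remains on the original reservation.

```shell
#!/bin/sh
# Hypothetical values; c0224 stands in for the newly added node.
resv_name="research-resv"
users="user1,user2,user3"
nodes="c[0220-0224]"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

recreate_reservation() {
    # 1. Drain the nodes so no new jobs land on them during the gap.
    run scontrol update NodeName="$nodes" State=DRAIN Reason=recreate_reservation &&
    # 2. Delete the old reservation.
    run scontrol delete ReservationName="$resv_name" &&
    # 3. Recreate it with the updated node list (adjust Duration as needed).
    run scontrol create reservation ReservationName="$resv_name" \
        StartTime=now Duration=30-00:00:00 Users="$users" Nodes="$nodes" &&
    # 4. Return the nodes to service; they are now covered by the reservation.
    run scontrol update NodeName="$nodes" State=RESUME
}

recreate_reservation
```

The steps are chained with `&&` so that a failure (for example, the delete being rejected) stops the sequence before the nodes are resumed.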
Canceling/ending reservation
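The original commands for this step are not preserved. As a sketch, a reservation can be removed with `scontrol delete`; the reservation name below is hypothetical and the `run` helper only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Hypothetical reservation name.
resv_name="research-resv"

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Remove the reservation.
delete_reservation() {
    run scontrol delete ReservationName="$resv_name"
}

# List remaining reservations to confirm it is gone.
list_reservations() {
    run scontrol show reservation
}

delete_reservation
list_reservations
```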