Overall things look good. One thing I think would make the feature more intuitive is to allow users to specify an arbitrary nodename/hostname for their dynamic nodes.
For instance, I tried creating a new node: sudo scontrol create nodename=d1 cpus=8 feature=dyn state=cloud. When Slurm tried resuming this node, azslurm rejected it because the hostname didn't match the pre-created hostnames. The user also has no way of knowing what these hostnames are without browsing the documentation, though I understand the pattern is "clustername-dynamic-1", which again is less intuitive. I think it's reasonable for azslurm to check the nodename against the output of sinfo to verify whether it is valid, since the user is required to issue scontrol create anyway, and then resume it as usual.
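Roughly what I have in mind for the sinfo check, as a sketch only. The function names here are hypothetical and not existing azslurm code, and I'm assuming the dynamic node shows up in `sinfo -N -h -o %N` once it has been created and matched to a partition:

```python
import subprocess
from typing import Set


def parse_sinfo_nodes(sinfo_output: str) -> Set[str]:
    """Collect node names from `sinfo -N -h -o %N` output (one name per line).

    A node that belongs to several partitions is listed once per partition,
    so a set is used to drop the repeats.
    """
    return {line.strip() for line in sinfo_output.splitlines() if line.strip()}


def known_slurm_nodes() -> Set[str]:
    """Ask Slurm which node names it currently knows about (hypothetical helper)."""
    # -N: one line per node, -h: no header, -o %N: print only the node name.
    result = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_sinfo_nodes(result.stdout)


def is_valid_nodename(nodename: str, known: Set[str]) -> bool:
    """Accept any nodename Slurm already knows, instead of a fixed hostname pattern."""
    return nodename in known
```

With something like this, the resume path could call is_valid_nodename("d1", known_slurm_nodes()) and accept the node because Slurm already knows it from scontrol create, rather than requiring a pre-created "clustername-dynamic-N" hostname.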