The main changes:
A. Added labels that contain the cluster name and (VM) hostname to each log. They start as placeholders in the cloud ops configuration and are updated at boot time by setup.py.
B. Added the logging path to the logging structure as another label, and generalized the overall log names. Since there is no longer a need for 6-7 different logNames, they were reduced to only 3: one for the slurm daemons (which will be called slurm_daemon), another for everything else slurm (thinking of putting this under the umbrella of just slurm to match legacy), and a third for the setup logs.
This makes it easy to filter log queries down to the hostname or cluster level.
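As a rough illustration of the kind of filtering this enables, the sketch below builds a Cloud Logging filter string from a log name and the new labels. The label keys `cluster_name` and `hostname` and the helper name `build_log_filter` are assumptions for illustration; the actual keys are whatever the cloud ops configuration in this change defines.

```python
def build_log_filter(log_id, cluster_name=None, hostname=None):
    """Build a Cloud Logging filter for one log, optionally narrowed by label.

    The label keys used here (cluster_name, hostname) are assumed, not taken
    from the actual configuration.
    """
    clauses = [f'log_id("{log_id}")']
    if cluster_name is not None:
        clauses.append(f'labels.cluster_name="{cluster_name}"')
    if hostname is not None:
        clauses.append(f'labels.hostname="{hostname}"')
    return " AND ".join(clauses)


# Cluster-level filter for the daemon logs.
print(build_log_filter("slurm_daemon", cluster_name="mycluster"))
# Host-level filter for the same log.
print(build_log_filter("slurm_daemon", hostname="mycluster-controller"))
```

The resulting string can be pasted into the Logs Explorer query box or passed to `gcloud logging read`.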
How was this tested?
Deployed a VM instance and added the code change to the cluster directly.
a. Ensured the code does not run if the user does not have cloud-ops-agent running as a service.
Manually changed the cloud-ops configuration to reflect the proposed changes.
b. Verified that the configuration is valid and the placeholders show as expected.
Ran the script changes as a single Python function.
c. Verified that the function edits the configuration and restarts the agent as expected, with updated label values.
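For reviewers who want to picture the boot-time step, here is a minimal sketch of the two pieces exercised above: a guard that skips the update when the agent is not running as a service, and a function that swaps placeholders for real label values. The placeholder tokens, the function names, and the systemd unit name `google-cloud-ops-agent` are assumptions for illustration and may not match the actual change in setup.py.

```python
import subprocess

# Hypothetical placeholder tokens; the real configuration may use different ones.
PLACEHOLDERS = {
    "cluster_name": "CLUSTER_NAME_PLACEHOLDER",
    "hostname": "HOSTNAME_PLACEHOLDER",
}


def ops_agent_is_active(service="google-cloud-ops-agent"):
    """Return True only if systemd reports the service as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", service], check=False
        )
    except OSError:
        # systemctl is missing or cannot be executed: treat as "not running".
        return False
    return result.returncode == 0


def fill_label_placeholders(config_text, cluster_name, hostname):
    """Replace each placeholder token in the config text with its real value."""
    values = {"cluster_name": cluster_name, "hostname": hostname}
    for key, token in PLACEHOLDERS.items():
        config_text = config_text.replace(token, values[key])
    return config_text


sample = "cluster: CLUSTER_NAME_PLACEHOLDER\nhost: HOSTNAME_PLACEHOLDER\n"
print(fill_label_placeholders(sample, "mycluster", "mycluster-login0"))
```

In a real run, the caller would check `ops_agent_is_active()` first, write the updated text back to the agent configuration, and then restart the service so the new labels take effect.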
As seen above, testing was by and large manual; the script changes can only be added along with the configuration changes.