stackhpc / ansible-role-openhpc

Ansible role for OpenHPC
Apache License 2.0

Fix job completion logfile existence #103

Closed sjpb closed 3 years ago

sjpb commented 3 years ago

Setting openhpc_slurm_job_comp_type (slurm.conf parameter JobCompType) to jobcomp/filetxt enables job completion records, which can be viewed using sacct -c. These records are much more limited than full accounting information, but could be useful now that Slurm 20.11 no longer supports filetxt accounting storage, so enabling accounting requires deploying and configuring mysql + slurmdbd (hence b3b4f44 disabling accounting by default).
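As a rough illustration, enabling this from inventory variables might look like the sketch below. The two variable names come from this role. Where the file lives (group_vars or play vars) is an assumption, and openhpc_slurm_job_comp_loc is deliberately left unset so the role default applies.

```yaml
# Hypothetical group_vars snippet: enable text-file job completion records.
# Maps to slurm.conf parameter JobCompType.
openhpc_slurm_job_comp_type: jobcomp/filetxt

# openhpc_slurm_job_comp_loc (slurm.conf parameter JobCompLoc) is not set here,
# so the role default is used.
```

Once slurmctld is running with this configuration, sacct -c should report the completion records.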

This PR fixes the runtime play so that the job completion logfile (role var openhpc_slurm_job_comp_loc, slurm.conf parameter JobCompLoc) is writable by user slurm. Without this fix, slurmctld fails on startup with a message like:

error: open <JobCompLoc>: Permission denied
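For readers unfamiliar with this kind of fix, a task along the following lines would make the logfile exist and be writable by user slurm before slurmctld starts. This is a minimal sketch, not necessarily the task added by this PR: the group name, file mode and the when condition are assumptions.

```yaml
# Sketch only: ensure the job completion logfile exists and is owned by slurm.
- name: Ensure job completion logfile exists and is writable by slurm
  ansible.builtin.file:
    path: "{{ openhpc_slurm_job_comp_loc }}"
    state: touch
    owner: slurm
    group: slurm                  # assumption: a slurm group exists
    mode: "0644"                  # assumption: mode is illustrative
    access_time: preserve         # avoid reporting a change on every run
    modification_time: preserve
  # assumption: only needed when text-file job completion is enabled
  when: openhpc_slurm_job_comp_type == 'jobcomp/filetxt'
```

Preserving the access and modification times keeps the task idempotent, so repeated runs do not report a change once the file is in place.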

This PR also adds molecule/test12 to cover this, using openhpc_slurm_job_comp_type: jobcomp/filetxt and the role default for openhpc_slurm_job_comp_loc, and updates the README to clarify why you might want to use this.

Closes #102, which contains a lot of information about Slurm behavior, most of it probably not relevant here.