radiocosmology / caput

Cluster Astronomical Python Utilities
GNU General Public License v2.0

Add email notification options for batch jobs #250

Closed sjforeman closed 1 year ago

sjforeman commented 1 year ago

This adds command-line options to caput-pipeline to get slurm or PBS to email a specific address when certain job events take place (e.g. begin, fail), inheriting the options from each scheduling system. I've only tested it on slurm (cedar), but the PBS syntax seems simple enough that it should work as well...
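For readers unfamiliar with the two schedulers, here is a minimal sketch of how a begin/end/fail event list could map onto each system's email directives. It is not the code from this PR: `mail_directives` and the two lookup tables are hypothetical names chosen for illustration. The directives themselves (`--mail-type` / `--mail-user` for slurm, `-m` / `-M` for PBS) are the schedulers' standard options. Mapping "fail" to the PBS `a` (abort) flag is an approximation.

```python
# Hypothetical helper for illustration; not caput's actual implementation.
SLURM_EVENTS = {"begin": "BEGIN", "end": "END", "fail": "FAIL"}
# PBS mail flags: a = job aborted, b = job begins, e = job ends.
# "fail" -> "a" is an approximation of slurm's FAIL event.
PBS_EVENTS = {"begin": "b", "end": "e", "fail": "a"}


def mail_directives(system, address, events):
    """Return the script header lines that request email for `events`."""
    events = [e.lower() for e in events]
    if system == "slurm":
        types = ",".join(SLURM_EVENTS[e] for e in events)
        return [
            f"#SBATCH --mail-type={types}",
            f"#SBATCH --mail-user={address}",
        ]
    if system == "pbs":
        # PBS takes a single string of one-letter flags.
        flags = "".join(sorted({PBS_EVENTS[e] for e in events}))
        return [f"#PBS -m {flags}", f"#PBS -M {address}"]
    raise ValueError(f"Unknown queue system: {system}")


if __name__ == "__main__":
    for line in mail_directives("slurm", "user@example.com", ["begin", "fail"]):
        print(line)
```

The same lines could equally be passed as command-line flags to `sbatch` or `qsub` rather than written into the job script.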

sjforeman commented 1 year ago

After some Googling and experimentation, I can't figure out why readthedocs is failing. The commands that give the failure,

git clone --depth 1 https://github.com/radiocosmology/caput.git .
git fetch origin --force --prune --prune-tags --depth 50 pull/250/head:external-250

run fine for me on cedar and my laptop. @ljgray , any ideas?

ljgray commented 1 year ago

Hmm, I also can't reproduce it. It's always worth making sure the branch is rebased onto master and pushing it again to see if the issue goes away; otherwise I'll have a look to see if I can figure anything out.

jrs65 commented 1 year ago

This looks good to me, though I wonder if it might have been cleaner to just require putting the options in the cluster section of the config file. It wouldn't need to be passed through as parameters in nearly so many places. Similarly, it could just embed the options in the script fed to slurm/pbs rather than adding them as CLI options.

In any case, it's probably not worth changing now, although in future it might be nice to also be able to pull the params from the cluster section.
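A rough sketch of that fallback, assuming the cluster section is available as a plain dict: values from the config act as defaults, and any CLI option that was actually given takes precedence. The key names `mail_type` and `mail_user` and the function `resolve_mail_options` are hypothetical, not existing caput names.

```python
# Hypothetical sketch; key and function names are assumptions, not caput's API.
def resolve_mail_options(cluster_conf, cli_mail_type=None, cli_mail_user=None):
    """Merge email settings from the cluster config with CLI overrides."""
    # Start from whatever the cluster section provides (None if absent).
    resolved = {
        "mail_type": cluster_conf.get("mail_type"),
        "mail_user": cluster_conf.get("mail_user"),
    }
    # A CLI value, when supplied, wins over the config file.
    if cli_mail_type is not None:
        resolved["mail_type"] = cli_mail_type
    if cli_mail_user is not None:
        resolved["mail_user"] = cli_mail_user
    return resolved


if __name__ == "__main__":
    conf = {"mail_type": "FAIL", "mail_user": "user@example.com"}
    print(resolve_mail_options(conf, cli_mail_type="BEGIN,FAIL"))
```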