Hello!
I made a small change that lets the administrator control whether sending a mail is retried after a failure.
In my case, users sometimes either mistyped their email addresses or copied a job template that contained an example/invalid address. This caused files to remain in `/var/spool/slurm-mail` forever, and the program would retry sending every minute, eventually getting my cluster banned from the mail server. Setting `retryOnFailure` to 0 always deletes the mail files after an attempted send.
I am catching the smtplib exceptions `SMTPHeloError`, `SMTPRecipientsRefused`, `SMTPSenderRefused` and `SMTPNotSupportedError`, but perhaps `SMTPHeloError` shouldn't be caught, since it most probably represents a connection error rather than an invalid recipient.
Thank you for your work on this very useful tool!
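
To make the open question about `SMTPHeloError` concrete, here is a rough sketch of the variant where it is not treated like a rejected address. This is not the code from the change itself: `process_spool_file`, the `send` callable and the `retry_on_failure` argument are hypothetical names used only for illustration, and only the `smtplib` exception classes are real.

```python
import smtplib
from pathlib import Path

# Errors that suggest the message itself was rejected (bad sender/recipient,
# unsupported option), as opposed to a problem reaching the server.
REJECTION_ERRORS = (
    smtplib.SMTPRecipientsRefused,
    smtplib.SMTPSenderRefused,
    smtplib.SMTPNotSupportedError,
)


def process_spool_file(path: Path, send, retry_on_failure: bool) -> bool:
    """Try to send one spooled mail; return True if the spool file was deleted.

    `send` is a hypothetical callable that takes the spool file path and
    raises an smtplib exception on failure.
    """
    try:
        send(path)
    except smtplib.SMTPHeloError:
        # Most likely a connection-level problem: keep the file regardless
        # of the setting so the mail is retried on the next run.
        return False
    except REJECTION_ERRORS:
        if retry_on_failure:
            # Keep the file so the next run tries again.
            return False
        # Otherwise fall through and delete the file after the failed attempt.
    path.unlink()
    return True
```

With `retry_on_failure` set to false, a refused recipient removes the spool file after one attempt, while a failed HELO leaves it in place for the next run.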