At present, there is no distinction between job failures and cancellations. Therefore, we should seek a way to differentiate between these two statuses.
If a job is cancelled, it usually makes sense to resubmit it, but a failure may mean the job can't proceed without changes.
We may be able to detect failures by catching all exceptions at the top level of the tool calculation script and logging a keyword, which will end up in the Slurm logs.
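A minimal sketch of that idea, not taken from the existing code: the marker string `TOOL_RUN_FAILED` and the helper names `run_with_failure_marker` and `classify_unfinished_job` are hypothetical, and the sketch assumes the tool script's stderr is what ends up in the Slurm log.

```python
import sys
import traceback

# Hypothetical marker; any string unlikely to appear in normal tool output works.
FAILURE_MARKER = "TOOL_RUN_FAILED"


def run_with_failure_marker(main, stream=None):
    """Run main(); on any exception, write the marker and traceback, return 1."""
    stream = stream if stream is not None else sys.stderr
    try:
        main()
    except Exception:
        # Assumption: this stream is captured in the Slurm log for the job.
        print(f"{FAILURE_MARKER}: unhandled exception in tool script", file=stream)
        traceback.print_exc(file=stream)
        return 1
    return 0


def classify_unfinished_job(log_text):
    """Return 'failed' if the marker is in the log text, else 'cancelled'."""
    return "failed" if FAILURE_MARKER in log_text else "cancelled"
```

The tool script would end with something like `sys.exit(run_with_failure_marker(main))`, and the task manager would pass the log contents of any job that did not complete to `classify_unfinished_job`. A missing marker only shows that the script did not raise a Python exception, so a job killed from outside for another reason (a time limit, for example) would also be reported as cancelled here.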
https://github.com/kbase/collections/blob/b07c7e04ae4a7ac4f0c1f99d60175f60fd909d6d/src/loaders/jobs/taskfarmer/taskfarmer_task_mgr.py#LL120C11-L120C11