Our DPS run bash script has a try/catch to make sure logs are copied to DPS outputs during failures but has the unintentional effect of not bubbling our system exit signal so the DPS runner knows it's a failure. Now that DPS will show statuses as failures we also need a way to alert us when a job fails
Solution
[x] trap the last system exit information and bubble it in the catch block
[ ] write a new job that checks for any job failures within the last 10 minutes and is scheduled every 10 minutes
Problem
Our DPS run bash script has a try/catch to make sure logs are copied to DPS outputs during failures but has the unintentional effect of not bubbling our system exit signal so the DPS runner knows it's a failure. Now that DPS will show statuses as failures we also need a way to alert us when a job fails
Solution