Open Xeonus opened 2 months ago
Whenever an OZ job fails, we should get an alert and someone should look into it.
We agreed to set up a new alerts channel for things that need a response and aren't just info.
The channel is set up. Both @Xeonus and @Hyferion have been invited, and the same bots we use in the current (info) channel are there as well.
We have now had multiple instances where the gauge adder failed and no error log was forwarded to our monitoring. We also didn't receive any alert via email.
Let's fix this.
We have a monitoring config in place, which needs to be reviewed.
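
As a starting point for discussion, here is a minimal sketch of the "fail loudly" behaviour described above: a job that raises should always trigger a notification before the error propagates. This is not our actual setup. `run_with_alert`, `job_name`, and the `notify` callback are hypothetical names, and whether the OZ jobs can be wrapped like this is an assumption to check during the config review. `notify` stands in for whatever posts to the new alerts channel.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_alert(
    job_name: str,
    job: Callable[[], T],
    notify: Callable[[str], None],
) -> T:
    """Run `job`; on any exception, send an alert via `notify`, then re-raise.

    `notify` is a hypothetical callback (e.g. something that posts to the
    alerts channel). It is injected so the wrapper stays independent of
    any particular chat or email integration.
    """
    try:
        return job()
    except Exception as exc:
        # Include the job name and exception type so the alert is actionable.
        notify(f"[ALERT] {job_name} failed: {type(exc).__name__}: {exc}")
        # Re-raise so the failure is still visible to the scheduler and logs.
        raise
```

Re-raising after notifying keeps the job's failed status intact, so the alert adds to the existing error reporting and does not replace it.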