livepeer / stream-tester

Stream tester is a tool to measure the performance and stability of the Livepeer transcoding network

[VID-482] Continuous record tester - decrease max fails from 5 to 3 #351

Closed pwilczynskiclearcode closed 11 months ago

pwilczynskiclearcode commented 11 months ago

Reduce the number of retries from 5 to 3 so that repeating issues don't slip past our monitoring.

I'd generally suggest rewriting record-tester so that it publishes Prometheus metrics instead of PagerDuty alerts. Alerts would then be based on Grafana metrics. Retries could produce a metric that would be observed… or record-tester would fail immediately and retry the entire test. Alerts could then be configured to tolerate short, single failures of record-tester, while repeated failures would raise an actual PagerDuty alert. This is just a thought; it's not in the scope of this PR/task.
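
To make the suggestion above concrete, here is a rough sketch of the "fail immediately, count the failure, and let the alert rule decide" idea. It is not record-tester's code and does not use the real Prometheus client: `FailureCounter`, `run_with_metrics`, and `should_alert` are hypothetical names, and the window of 5 runs with a threshold of 3 failures is an assumed alert policy chosen only for illustration.

```python
class FailureCounter:
    """Hypothetical stand-in for a Prometheus counter metric."""

    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1


def run_with_metrics(run_test, counter):
    """Run one test attempt with no internal retries.

    A failure is recorded in the counter instead of paging anyone;
    returns True on success and False on failure.
    """
    try:
        run_test()
        return True
    except Exception:
        counter.inc()
        return False


def should_alert(results, window=5, threshold=3):
    """Assumed alert rule: fire only when at least `threshold`
    of the last `window` runs failed."""
    recent = list(results)[-window:]
    failures = sum(1 for ok in recent if not ok)
    return failures >= threshold
```

With this policy, a single failed run among otherwise passing runs is recorded but does not page anyone, while three failures within the last five runs would. In a real setup the rule would live in the alerting layer rather than in the tester itself.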

linear[bot] commented 11 months ago

VID-482 Reduce record-tester's internal "max fails" from 5 to 3