A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Creative Commons Zero v1.0 Universal
Add "Scaling your on-duty team" blog post by Teads #25
The internet never sleeps, and even with the best design for resilience, one day, your system will go down.
At Teads, we deliver outstream video advertising for the biggest content publishers in the world. Any downtime has significant repercussions on our revenue, but also on our publishers’ revenue.
In a few years we grew from a start-up to a scale-up. Although we operate globally, our tech team is mostly based in France. For this reason, we decided to think carefully about scaling our on-duty team in order to minimize downtime when a system goes down.
Abstract of the article: