dastergon / awesome-sre

A curated list of Site Reliability and Production Engineering resources.
https://sre.xyz
Creative Commons Zero v1.0 Universal

Awesome Site Reliability Engineering

A curated list of awesome Site Reliability and Production Engineering resources.

What is Site Reliability Engineering?

"Fundamentally, it's what happens when you ask a software engineer to design an operations function." - Ben Treynor Sloss, VP Google Engineering, founder of Google SRE

Contributing

Please take a look at the contribution guidelines first. Contributions are always welcome!

Contents

Culture

Education

Books

Hiring

Reliability

Monitoring & Observability & Alerting

On-Call

Post-Mortem

Capacity Planning

Service Level Agreement

Performance

Programming

Misc Articles

Real-time Messaging

Blogs

Newsletters

Conferences & Meetups

Twitter

SRE Tools

Podcasts