Lessons from Building Observability Tools at Netflix
Summary
This article explains why and how Netflix built centralized observability tools. As Netflix's business grows, scalability becomes critical. In addition, because Netflix adopted a microservices architecture, it has become increasingly hard to troubleshoot issues and find the root cause when an error occurs. For these reasons, Netflix built a set of observability tools so that most of its engineers can keep innovating rapidly and delivering that innovation to customers.
For logs at scale, they built Mantis, a real-time stream-processing platform. They also built a tool for distributed request tracing, which gives insight into the microservices environment by providing additional contextual data based on the constituent traces. For monitoring and analyzing metrics, they have Atlas. Cassandra, Elasticsearch, and Hive serve as persistent data stores. Finally, they provide user interfaces with customized views, so that each group sees what it wants to see.
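To make the idea of distributed request tracing more concrete, here is a minimal sketch of how spans emitted by separate services could be grouped by a shared trace ID and summarized. It is only an illustration and not Netflix's implementation: the `Span` fields, the `summarize_traces` function, and the service names are all assumptions invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One unit of work done by one service for one request (hypothetical shape)."""
    trace_id: str             # shared by every span of the same request
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the request
    service: str
    duration_ms: float
    error: bool = False


def summarize_traces(spans):
    """Group spans by trace ID and derive per-request context from them."""
    by_trace = defaultdict(list)
    for span in spans:
        by_trace[span.trace_id].append(span)

    summaries = {}
    for trace_id, group in by_trace.items():
        # The root span covers the whole request, so its duration is the total.
        root = next((s for s in group if s.parent_id is None), None)
        summaries[trace_id] = {
            "root_service": root.service if root else None,
            "total_ms": root.duration_ms if root else None,
            "services": sorted({s.service for s in group}),
            "failed_services": sorted({s.service for s in group if s.error}),
        }
    return summaries


# Hypothetical spans: request t1 crosses three services and fails in one of them.
spans = [
    Span("t1", "a", None, "api-gateway", 120.0),
    Span("t1", "b", "a", "playback", 80.0),
    Span("t1", "c", "b", "license", 60.0, error=True),
    Span("t2", "d", None, "api-gateway", 15.0),
]
summaries = summarize_traces(spans)
```

Looking up `summaries["t1"]` shows which services the request touched and which one reported an error. That is the kind of contextual data that helps narrow down a root cause in a microservices environment.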