Replace the old Prometheus stack with the LGTM stack from Grafana (Loki, Grafana, Tempo, Mimir).
This deploys the following new services in the monitoring namespace:
Mimir
Loki
Tempo (optional)
Grafana
These services provide enhanced observability into Gen3 deployments and store their data in S3.
Details:
Loki:
Loki is a log aggregation system designed to store and query logs efficiently.
It indexes only log metadata (labels) rather than the full log content, which reduces storage costs and speeds up queries.
Integrated with Grafana for easy visualization and dashboard creation.
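Because Loki indexes labels rather than log text, a query first selects streams by label and then filters the matching lines. The following is a minimal sketch of building such a query against Loki's `/loki/api/v1/query_range` HTTP endpoint. The service address, the label names (`app`, `namespace`), and the label values are assumptions for illustration, not taken from this PR; the real values depend on how the chart is deployed.

```python
from urllib.parse import urlencode

# Hypothetical in-cluster address; the actual service name depends on the Helm release.
LOKI_URL = "http://loki-gateway.monitoring.svc.cluster.local"


def build_loki_query_url(base_url, labels, text_filter, limit=100):
    """Build a URL for Loki's query_range endpoint.

    The label selector is resolved through the index; the `|=` line
    filter is then applied to the content of the selected streams.
    """
    selector = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    logql = f'{{{selector}}} |= "{text_filter}"'
    params = urlencode({"query": logql, "limit": limit})
    return f"{base_url}/loki/api/v1/query_range?{params}"


# Label names and values here are illustrative assumptions.
url = build_loki_query_url(LOKI_URL, {"namespace": "default", "app": "fence"}, "error")
print(url)
```

Narrow label selectors keep the set of scanned streams small, which is where the index-only-metadata design pays off.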
Mimir:
Mimir is a scalable and highly available time-series database compatible with Prometheus.
It provides advanced query capabilities for time-series data.
Supports long-term storage and efficient data retrieval.
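Since Mimir is Prometheus-compatible, it can be queried with PromQL over the Prometheus HTTP API, which Mimir serves under the `/prometheus` prefix by default. The sketch below only builds the request. The service address, the metric name, and the tenant ID are assumptions for illustration; the `X-Scope-OrgID` header is how Mimir identifies a tenant, and `anonymous` is assumed here for a deployment with multi-tenancy disabled.

```python
from urllib.parse import urlencode

# Hypothetical in-cluster address; the actual service name depends on the Helm release.
MIMIR_URL = "http://mimir-nginx.monitoring.svc.cluster.local"

# Hypothetical metric name, used only to illustrate a PromQL expression.
PROMQL = "sum(rate(http_requests_total[5m]))"


def build_mimir_query(base_url, promql, tenant="anonymous"):
    """Return the URL and headers for an instant PromQL query against Mimir."""
    params = urlencode({"query": promql})
    url = f"{base_url}/prometheus/api/v1/query?{params}"
    headers = {"X-Scope-OrgID": tenant}  # tenant ID is an assumption
    return url, headers


url, headers = build_mimir_query(MIMIR_URL, PROMQL)
print(url)
print(headers)
```

The same URL shape is what a Grafana Prometheus data source would use when pointed at Mimir.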
Tempo (optional):
Tempo is a distributed tracing backend that stores and queries trace data.
Helps in tracking requests as they flow through various services, aiding in performance monitoring and troubleshooting.
Deployment is optional; when enabled, it integrates with Grafana for trace visualization.
Grafana:
Grafana is a multi-platform open-source analytics and interactive visualization web application.
Provides a unified interface for visualizing data from Loki, Mimir, and Tempo.
Supports the creation of detailed, customizable dashboards for monitoring and alerting.
These additions improve the observability and monitoring capabilities of Gen3 deployments, giving better performance insights and easier troubleshooting.
Link to JIRA ticket if there is one: https://ctds-planx.atlassian.net/browse/GPE-1272
New Features
Replace old Prometheus stack with LGTM stack from Grafana.
Breaking Changes
Bug Fixes
Improvements
Dependency updates
Deployment changes