KTH / devops-course

Repository of the DevOps course at KTH Royal Institute of Technology DD2482

Monitoring, tracing, observability in DevOps #8

Open monperrus opened 6 years ago

monperrus commented 6 years ago
monperrus commented 2 years ago

FaaSter Troubleshooting: Evaluating Distributed Tracing Approaches for Serverless Applications

monperrus commented 2 years ago

Timeloops: System Call Policy Learning for Containerized Microservices.

bbaudry commented 2 years ago

Sampler is a tool for shell command execution, visualization, and alerting. It is configured with a simple YAML file. https://sampler.dev/
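To make the "execute, then alert" idea concrete, here is a minimal Python sketch of the general technique: run a shell command periodically, read a number from its output, and flag readings above a threshold. This is not Sampler's implementation or configuration format; the function names `sample` and `watch` and all parameters are hypothetical, chosen only for illustration.

```python
import subprocess
import time


def sample(command):
    """Run a shell command and parse its stdout as a single number."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return float(result.stdout.strip())


def watch(command, threshold, samples=3, interval_s=0.1):
    """Sample `command` repeatedly and return the readings above `threshold`.

    A real tool would plot every reading and fire a notification;
    this sketch only collects the offending values.
    """
    alerts = []
    for _ in range(samples):
        value = sample(command)
        if value > threshold:
            alerts.append(value)
        time.sleep(interval_s)
    return alerts
```

For example, on Linux `watch("cut -d' ' -f1 /proc/loadavg", threshold=4.0)` would return the 1-minute load averages that exceeded 4.0 during the sampling window (the command and threshold are assumptions for the example).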

monperrus commented 2 years ago

Stagemonitor is a Java monitoring agent that tightly integrates with time series databases like Elasticsearch, Graphite, and InfluxDB to analyze graphed metrics, and with Kibana to analyze requests and call stacks.

https://github.com/stagemonitor/stagemonitor

cc/ @gluckzhang

monperrus commented 2 years ago

Trace Server Protocol https://github.com/eclipse-cdt-cloud/trace-server-protocol

bbaudry commented 2 years ago

A sweet feature of Grafana: https://grafana.com/blog/2021/07/30/how-to-use-grafana-and-prometheus-to-rickroll-your-friends-or-enemies/?src=li&mdm=social

bbaudry commented 2 years ago

https://github.com/MacroPower/prometheus_video_renderer

bbaudry commented 2 years ago

Zabbix is an open source monitoring solution for network and application monitoring of millions of metrics. https://www.zabbix.com/

monperrus commented 2 years ago

strace is a diagnostic, debugging and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state. https://strace.io/
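As a rough illustration of monitoring system calls, the Python sketch below runs a command under `strace -c` (which prints a per-syscall summary) and turns the summary into a dictionary. The helper names `parse_strace_summary` and `count_syscalls` are hypothetical, and the parser assumes the usual column layout of the `-c` table (`% time`, `seconds`, `usecs/call`, `calls`, `errors`, `syscall`), which may differ slightly between strace versions.

```python
import subprocess
import tempfile


def parse_strace_summary(text):
    """Parse the table printed by `strace -c` into {syscall: {calls, errors}}."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and the dashed separator rows.
        if len(fields) < 5 or fields[0].startswith(("%", "-")):
            continue
        name = fields[-1]
        if name == "total":
            continue
        calls = int(fields[3])
        # The errors column is left empty when a syscall never failed.
        errors = int(fields[4]) if len(fields) == 6 else 0
        counts[name] = {"calls": calls, "errors": errors}
    return counts


def count_syscalls(argv):
    """Run `argv` under strace (must be installed) and return syscall counts."""
    with tempfile.NamedTemporaryFile("r", suffix=".strace") as out:
        # -c: summary table, -f: follow child processes, -o: write to a file.
        subprocess.run(["strace", "-c", "-f", "-o", out.name, *argv], check=True)
        return parse_strace_summary(out.read())
```

On a Linux machine with strace available, `count_syscalls(["ls", "/"])` would return the calls made by `ls`, which is a quick way to see which system calls a process relies on.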

monperrus commented 2 years ago

Spring Metrics https://docs.spring.io/spring-metrics/docs/current/public/prometheus

monperrus commented 2 years ago

Let's Trace It: Fine-Grained Serverless Benchmarking using Synchronous and Asynchronous Orchestrated Applications https://arxiv.org/pdf/2205.07696.pdf

monperrus commented 1 year ago

Open Tracing Tools: Overview and Critical Comparison https://arxiv.org/pdf/2207.06875.pdf

monperrus commented 1 year ago

Reliability Pillar - AWS Well-Architected Framework https://docs.aws.amazon.com/pdfs/wellarchitected/latest/reliability-pillar/wellarchitected-reliability-pillar.pdf

monperrus commented 1 year ago

Towards Solving the Challenge of Minimal Overhead Monitoring. (arXiv:2304.05688v1 [cs.SE])

monperrus commented 1 year ago

Lessons Learned Building a Global Synthetic Monitoring System (talk at SREcon) https://www.usenix.org/conference/srecon22apac/presentation/sidh

monperrus commented 1 year ago

Elastic Observability https://www.elastic.co/observability

monperrus commented 1 year ago

Dynatrace https://www.dynatrace.com/solutions/application-monitoring/

monperrus commented 1 year ago

CausalRCA: Causal inference based precise fine-grained root cause localization for microservice applications (JSS 2023)

bbaudry commented 1 year ago

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE) https://github.com/upgundecha/howtheysre

bbaudry commented 1 year ago

Observability with GitLab: https://opstrace.com/ and https://about.gitlab.com/direction/monitor/observability/