Open bprashanth opened 2 months ago
We want to see the trend in these metrics over time, not just a point-in-time snapshot. For example, I want to know the number of OOMs in the last week and exactly when they happened, so I can match them up with other system logs and debug what's using the most RAM.
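To make the OOM example concrete, here is a rough sketch of the kind of query I have in mind, assuming each event is stored with its timestamp. `MetricEvent`, `events_in_window` and the `"oom"` metric name are hypothetical and only for illustration; a real setup would likely query a time-series store instead of an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MetricEvent:
    name: str            # hypothetical metric name, e.g. "oom"
    timestamp: datetime  # when the event happened


def events_in_window(events, name, now, window=timedelta(days=7)):
    """Return events called `name` within `window` before `now`, oldest first."""
    cutoff = now - window
    matching = [e for e in events if e.name == name and cutoff <= e.timestamp <= now]
    return sorted(matching, key=lambda e: e.timestamp)
```

`len(events_in_window(events, "oom", now))` gives the count for the week, and the returned timestamps are what we'd line up against the other system logs.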
There are 2 levels of metrics: one captured through code instrumentation, and the other captured via AWS / VM resource usage. We need a way to track both, and to receive email/SMS (whatever works) when certain preconfigured thresholds are crossed (e.g. latency > 1s, RAM > 90% used).
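A minimal sketch of the threshold check, using the two example limits above. The metric names (`latency_seconds`, `ram_used_percent`), the `Threshold` type and `crossed_thresholds` are assumptions for illustration; the function only builds the alert messages, and actual delivery by email/SMS is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric name
    limit: float  # alert when the sampled value is strictly above this


# Preconfigured thresholds taken from the examples: latency > 1s, RAM > 90% used.
THRESHOLDS = [
    Threshold("latency_seconds", 1.0),
    Threshold("ram_used_percent", 90.0),
]


def crossed_thresholds(samples, thresholds):
    """Return one alert message per sampled metric that exceeds its limit."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)  # metrics missing from the sample are skipped
        if value is not None and value > t.limit:
            alerts.append(f"{t.metric} = {value} exceeds threshold {t.limit}")
    return alerts
```

The same check would work for both levels of metrics, as long as the instrumented values and the AWS / VM values end up in one `samples` mapping.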