Open nikimanoledaki opened 10 months ago
Looking at SRE Metrics, @incertum, do you already have a Grafana dashboard for these metrics? We would need to either create Prometheus queries or access them through the Falco internal metrics.
@nikimanoledaki Falco does not yet have a Prometheus exporter. We may have one for Falco 0.38 in May, but I need to check with the other maintainers. Meanwhile, Falco metrics are available as internal Falco rules that can be piped to log-rotated files (JSONL formatted).
I propose making the CNCF SRE Metrics independent of Falco and Falco's own metrics: report the CPU and memory usage of project binaries through your preferred framework, and create your preferred Grafana dashboards. WDYT?
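To make the proposal concrete, here is a minimal sketch of what project-independent CPU and memory queries could look like. It assumes the cluster's Prometheus scrapes the standard cAdvisor/kubelet metrics `container_cpu_usage_seconds_total` and `container_memory_working_set_bytes`. The helper name `usage_queries`, the `falco` namespace, and the pod regex are hypothetical and only for illustration.

```python
def usage_queries(namespace, pod_regex, window="5m"):
    """Build PromQL strings for per-pod CPU and memory usage.

    Assumes the standard cAdvisor metric names are available in Prometheus.
    """
    # container!="" drops the pod-level aggregate series that cAdvisor also reports.
    selector = f'namespace="{namespace}", pod=~"{pod_regex}", container!=""'
    return {
        # CPU cores used, averaged over the window.
        "cpu_cores": (
            f"sum by (pod) (rate(container_cpu_usage_seconds_total{{{selector}}}[{window}]))"
        ),
        # Working-set memory in bytes at query time.
        "memory_bytes": (
            f"sum by (pod) (container_memory_working_set_bytes{{{selector}}})"
        ),
    }


# Hypothetical namespace and pod pattern.
queries = usage_queries("falco", "falco-.*")
for name, query in queries.items():
    print(f"{name}: {query}")
```

The same two query strings could be pasted into a Grafana panel or sent to the Prometheus HTTP API, so nothing here depends on Falco exposing its own metrics.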
I wonder if there are any useful metrics in the default metrics of Kubernetes, for example:
It would be nice to somehow surface the internal Falco metrics that way, but I'm not sure if that would be possible since those would be logs, not metrics.
What is the filesystem location where the internal Falco metrics are exported? These metrics are at the Pod level, correct?
Which Falco Metrics would you find useful or relevant for either 1) performance monitoring or 2) setting up the benchmark tests?
Looking at this, I imagine "kernel.evt_rate" is one that we would definitely need for the benchmark tests.
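As a rough sketch of how such a field could be pulled out of the JSONL metrics files mentioned earlier in the thread: the snippet below assumes each line is a JSON object whose metric values sit under an `output_fields` mapping, and that the key is literally `kernel.evt_rate`. Both are assumptions that may not match the actual Falco output, and the helper name `read_metric` is hypothetical.

```python
import json
from statistics import mean


def read_metric(lines, field, container="output_fields"):
    """Collect numeric values of `field` from JSONL metric snapshots.

    `container` is the assumed key holding the metric mapping; if it is
    absent, the top-level object is searched instead.
    """
    values = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed or partially written lines
        if not isinstance(record, dict):
            continue
        fields = record.get(container, record)
        value = fields.get(field) if isinstance(fields, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            values.append(float(value))
    return values


# Made-up sample lines, not real Falco output.
sample = [
    '{"output_fields": {"kernel.evt_rate": 100.0}}',
    '{"output_fields": {"kernel.evt_rate": 300.0}}',
    "not json",
]
rates = read_metric(sample, "kernel.evt_rate")
print(mean(rates))
```

In practice `lines` could be an open file handle on one of the rotated files, since iterating a file yields one line at a time.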
I created two deep-dive tickets on the steps to collect the metrics and visualize them. I made a distinction between the Kepler and Kubernetes-related metrics, which have a more standard approach, and the Falco metrics, which need some more thought on the process. I hope that is clear; please let me know.
This issue aims to investigate the sustainability-related metrics that could be implemented as part of our reference architecture.
The WG has so far identified the following use cases that each require a slightly different set of metrics:
SRE Metrics
Metrics used by CNCF project maintainers to make improvements at the application level. For example, as mentioned by @incertum in the issue linked before: Falco's own internal metrics (CPU, memory, and counters), traditional SRE metrics (CPU/mem usage), and energy metrics.
More information about this can be found in the Metrics section of the Green Reviews design document.
Sustainability Metrics
Other emerging indices that can be used to assess an application's sustainability footprint may also be considered in the future.
Benchmark-Specific Metrics
Metrics to set up the benchmark tests for each CNCF project.
These metrics are often inter-related. For example, data about energy consumption can be used in each of these scenarios.
This issue can be used to track ideas and discussions about which metrics the Green Reviews pipeline should track. That being said, prioritisation is key so that the WG remains on track with the milestones set in the Roadmap.