Open incertum opened 1 month ago
CC @FedeDP @sgaist @leogr
- `libs_metrics_collector`: a static `sanitize_metric_name` (https://github.com/falcosecurity/falco/pull/3192#discussion_r1599855210), following the OpenMetrics standard, was introduced in Falco. It should perhaps be pushed to the libs metrics collector as well.

Great discussions are happening in https://github.com/falcosecurity/falco/pull/3140; a few follow-up items:

- `num_evts` is still missing in the Prometheus output since it requires larger code refactors.
- `evt.source`? Since we can only provide one view of the metrics at a time, instead of adding nested fields or implementing another solution, Falco metrics should focus on the syscalls or the primary plugin source only. When running Falco with syscalls and one plugin, the new plugin metrics API should be used to retrieve plugin metrics in addition to the syscalls metrics.
- The metrics framework should target the primary event source only, as the metrics snapshots can realistically expose only one current view, especially for Prometheus. Plugin metrics should instead be supported via the new plugin metrics support; see libs#1828, "new(plugin_api): add plugin metrics support".
- [ ] A consolidated and proper Falco metrics model is needed, given that we now have even more output channels for the metrics (e.g. Prometheus).
Hey @incertum could you elaborate more on these two points?
@leogr I rewrote the text (https://github.com/falcosecurity/falco/issues/3194#issuecomment-2111009270). Is it clearer now? Happy to add more details.
Much clearer now, thank you!
Just one thought:
- Since we can only provide one view of the metrics at a time
Why? I guess this is a current limitation, but we can fix it in the future. Am I wrong? I believe that in the long run, all data sources should be first-class citizens, and it shouldn't be technically impossible to accommodate this.
We can emit multiple rules outputs or lines into the output file (I would not do it though), but for Prometheus there is just one endpoint to scrape at a time. IMO there should be more separate, plugin-specific metrics handling, something that was started in libs. Most metrics are either specific to the syscalls source or generic (e.g. CPU and memory usage or rules counters) anyway. Right now, the number of events is the only metric I can think of that would be useful per plugin / source when you have multiple sources.
CC @sboschman (metrics for Falco w/ plugin only)
From an operational point of view I like to have the falco metrics easily integrated with our metrics platform. So, I would like to thank everyone involved with exposing the falco metrics in a Prometheus compatible way.
I am not familiar with the falco code at all, so consider the following comments more as an outside view of things, not in any way directly mapping to any part of the code.
Falco metrics:

Notes:
- [...] `sum without(event_source) (events_processed_total{})` and does not have to be explicitly exposed by Falco
- [...] `k8s_audit` event source. So (5) are plugin-specific metrics, not event-source-specific metrics.

Few more thoughts:
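
For readers less familiar with PromQL, the `sum without(event_source) (events_processed_total{})` expression quoted in the notes above can be illustrated with a small sketch. It mimics the aggregation in plain Python: drop one label, then sum the series that become identical. The metric and label names come from the comment; the label values and counts are made up for illustration, and `sum_without` is a hypothetical helper, not part of any Prometheus client library.

```python
from collections import defaultdict


def sum_without(samples, dropped_label):
    """Sum sample values after removing one label, like PromQL `sum without(...)`."""
    totals = defaultdict(float)
    for labels, value in samples:
        # Series that differ only in the dropped label collapse into one key.
        remaining = tuple(
            sorted((k, v) for k, v in labels.items() if k != dropped_label)
        )
        totals[remaining] += value
    return dict(totals)


# Illustrative events_processed_total samples, one per event source
# (values are invented, not real Falco output).
samples = [
    ({"event_source": "syscall"}, 1200.0),
    ({"event_source": "k8s_audit"}, 300.0),
]

total = sum_without(samples, "event_source")
```

Here `total` is `{(): 1500.0}`: a single series with no remaining labels. This is why a per-source counter is enough, since the overall total can be derived at query time rather than exposed as a separate metric.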
Motivation
Tracking pending cleanups, refactors or additional features for the Falco internal metrics framework https://falco.org/docs/metrics/