falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0

[DISCUSSION] deprecate / remove old internal drop stats? #2910

Open incertum opened 7 months ago

incertum commented 7 months ago

@Andreagit97 opening a dedicated issue to discuss https://github.com/falcosecurity/libs/pull/1433#issuecomment-1807694658

I saw that in Falco (and I imagine also in other consumers) we use the get_capture_stats method to obtain the number of drops/events. On the other hand, with stats_v2 we are using an agnostic approach, where the final consumer receives a vector of metrics already populated by sinsp. My question here is, do we want scap_stats_v2 to replace the old scap_stats? If yes, how do we obtain the specific number of drops/events from this agnostic approach? Do we want to keep these specific numbers, or is the final goal to expose a set of metrics with a Prometheus endpoint?
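To make the question above concrete, here is a minimal sketch of how a consumer could pull specific counters back out of a generic metrics vector by name. Everything in it is an assumption made for illustration: the Metric struct, the find_metric and drop_pct helpers, and the metric names "n_evts" and "n_drops" are hypothetical stand-ins, not the actual stats_v2 types or API in falcosecurity/libs.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for one entry of the generic metrics vector.
// The real stats_v2 entry type in falcosecurity/libs is different.
struct Metric {
    std::string name;
    uint64_t value;
};

// Returns the value of the first metric with the given name,
// or std::nullopt if no such metric is present.
std::optional<uint64_t> find_metric(const std::vector<Metric>& metrics,
                                    std::string_view name) {
    for (const auto& m : metrics) {
        if (m.name == name) {
            return m.value;
        }
    }
    return std::nullopt;
}

// Computes the drop percentage from the generic vector.
// "n_evts" and "n_drops" are assumed names used only for this sketch.
// Returns 0.0 if either counter is missing or no events were seen.
double drop_pct(const std::vector<Metric>& metrics) {
    const auto evts = find_metric(metrics, "n_evts");
    const auto drops = find_metric(metrics, "n_drops");
    if (!evts || !drops || *evts == 0) {
        return 0.0;
    }
    return 100.0 * static_cast<double>(*drops) / static_cast<double>(*evts);
}
```

The trade-off this illustrates: a name-based lookup keeps the consumer decoupled from a fixed struct layout, but it relies on metric names staying stable across releases, which a dedicated struct such as the old scap_stats enforces at compile time.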

First and foremost, we are talking about https://github.com/falcosecurity/falco/blob/master/userspace/falco/event_drops.cpp, aka the "Falco internal: syscall event drop" rule, which we will call the "old drop stats". This stands in contrast to the new metrics Falco option, which can also create an internal rule, "Falco internal: metrics snapshot", containing not just the drop counters but many more metrics.

@leogr I am starting to summarize a few shortcomings of the old stats from my perspective. At the same time, I want to honor that some adopters prefer to keep the old stats around for longer. Therefore I would be fine with keeping them, but I am also willing to help work out a transition plan.

Cons old drop stats:

Pros old drop stats:

Andreagit97 commented 7 months ago

yeah we need to explore a little bit all the usages of these old drops stats in Falco, to understand if we really need them or if we can just replace them with the new ones

incertum commented 6 months ago

When I asked on Slack, no one seemed to urgently need this anymore.

In the last 3+ debugging sessions I have been involved in, we always found the newer metrics feature to provide more actionable insights.

Proposing to introduce a deprecation warning for Falco 0.38 or Falco 0.37 and then follow the formal deprecation cycle? WDYT @falcosecurity/falco-maintainers?

Besides the pros and cons I listed above, deprecating the old drop stats will help us communicate easier-to-follow debugging steps and reduce the config surface, effectively making space for new configs that will move the needle in terms of improving Falco's performance and capabilities.

poiana commented 3 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

Andreagit97 commented 3 months ago

/remove-lifecycle stale

leogr commented 3 months ago

/assign

added to my backlog :angel:

Tentatively for /milestone 0.38.0