Closed: antikilahdjs closed this issue 3 months ago
Hey, thank you for reporting!
Then the pod memory increased to 38 GB or more, and CPU usage as well, so I would like to know whether this is a bug or not.
Uhm, it seems like a bug; we need to investigate this more!
Thank you so much @Andreagit97. I will send below a screenshot of a real query in Prometheus. I set a resource limit of 42 GB, but if I remove that limit, it reaches more than 120 GB.
I started the auditing, and within 3 minutes the memory reached 22 GB.
Thank you for the additional data! Right now we are a little busy, but we will get to it after the Falco release!
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Not fixed
/remove-lifecycle stale
You increased the max event size to 134 GB and the max webhook batch size to 268 GB? In that case the memory usage is sort of expected, I guess, as up to 268 GB of JSON has to be processed at once...
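For context, those limits correspond to the kube-apiserver audit webhook flags. A minimal sketch of where they are set (a static pod manifest fragment; the values shown are the upstream defaults, not the huge ones reported above):

```yaml
# Sketch: kube-apiserver static pod manifest fragment showing the audit
# webhook flags under discussion (default values, for orientation only).
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
        # Truncate/split events instead of failing when they exceed the caps.
        - --audit-webhook-truncate-enabled=true
        # Max size in bytes of a single audit event sent to the webhook.
        - --audit-webhook-truncate-max-event-size=102400
        # Max size in bytes of one webhook batch of events.
        - --audit-webhook-truncate-max-batch-size=10485760
        # Max number of events buffered into one batch.
        - --audit-webhook-batch-max-size=400
```

Raising the two truncate caps by orders of magnitude means the receiver has to hold correspondingly large JSON payloads in memory while parsing them.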
A few things you might experiment with:

- reducing the `--audit-webhook-batch-max-size` flag on your api server;
- running multiple falco instances; you might need them to keep up with your audit event stream, as you mention having a large cluster;
- reviewing your `audit-policy.yaml` (docs), in case you are not already doing so, as the api server can generate massive amounts of audit events which are not all relevant to falco;
- inspecting the `requestObject` of the offending events (e.g. a ConfigMap): you might be able to find the event which includes some massive k8s object and consider dropping it via the audit-policy.yaml (see the sketch below).
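To make the last point concrete, here is a minimal audit-policy.yaml sketch (the specific rules are illustrative assumptions; adapt them to your workload) that keeps large requestObject payloads such as ConfigMaps out of the webhook stream:

```yaml
# Hypothetical audit-policy.yaml: keep the bodies of potentially huge
# objects out of the audit events shipped to falco.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Record only metadata (no requestObject/responseObject) for objects
  # that tend to carry large bodies.
  - level: Metadata
    resources:
      - group: ""
        resources: ["configmaps", "secrets"]
  # Drop read-only noise entirely.
  - level: None
    verbs: ["get", "list", "watch"]
  # Default for everything else: metadata only.
  - level: Metadata
```

Pass it to the api server with --audit-policy-file; rules are evaluated in order and the first match wins, so put the most specific rules first.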
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://github.com/falcosecurity/community.
/close
@poiana: Closing this issue.
Describe the bug
How to reproduce it
Expected behaviour
In my lab everything works perfectly because I don't have a large environment, but in production I was facing an error about the body being too large, so I increased the two parameters to make it work.
Then the pod memory increased to 38 GB or more, and CPU usage as well, so I would like to know whether this is a bug or not.
My environment is quite large, but it is strange, because I tested other applications and they stay at around 12 GB.
I would like to fix this error, or, if I did something wrong, please help me with it.
Screenshots
Environment
Falco version:
Thu Sep 14 15:17:25 2023: Falco version: 0.35.1 (x86_64)
Thu Sep 14 15:17:25 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
{"default_driver_version":"5.0.1+driver","driver_api_version":"4.0.0","driver_schema_version":"2.0.0","engine_version":"17","falco_version":"0.35.1","libs_version":"0.11.3","plugin_api_version":"3.0.0"}
System info:
{ "machine": "x86_64", "nodename": "falco-auditing-56bdb4c9b6-5wbjr", "release": "4.18.0-348.el8.0.2.x86_64", "sysname": "Linux", "version": "#1 SMP Sun Nov 14 00:51:12 UTC 2021" }
Cloud provider or hardware configuration:
OS: Red Hat 8.5
Kernel:
4.18.0-348.el8.0.2.x86_64
Installation method:
Official Helm charts from https://github.com/falcosecurity/charts
Additional context