Falco Webhook getting an error - "http: request body too large"

antikilahdjs commented 1 year ago

Describe the bug

How to reproduce it

Install using the normal way to use k8s-audit. I used the official helm charts

Expected behaviour

In my lab everything works perfecly because I dont have a large environment but in my production I am facing the error about the body is too large then I had increased the 2 parameters to works correctly

maxEventSize: 134217728
webhookMaxBatchSize: 268435456

Then the POD memory increased to 38gb or more and the cores either, so I would like to know it is a bug or not.

My environment is too large but is so weird because I tested other applications and works around 12gb.

I would like to fix the error or if I did something wrong please help me on it.

Screenshots

Environment

Falco version:

Thu Sep 14 15:17:25 2023: Falco version: 0.35.1 (x86_64) Thu Sep 14 15:17:25 2023: Falco initialized with configuration file: /etc/falco/falco.yaml {"default_driver_version":"5.0.1+driver","driver_api_version":"4.0.0","driver_schema_version":"2.0.0","engine_version":"17","falco_version":"0.35.1","libs_version":"0.11.3","plugin_api_version":"3.0.0"}
System info:

{ "machine": "x86_64", "nodename": "falco-auditing-56bdb4c9b6-5wbjr", "release": "4.18.0-348.el8.0.2.x86_64", "sysname": "Linux", "version": "#1 SMP Sun Nov 14 00:51:12 UTC 2021" }
Cloud provider or hardware configuration:
OS: Redhat 8.5
Kernel:

4.18.0-348.el8.0.2.x86_64
Installation method:

Officinal Helm Charts on https://github.com/falcosecurity/charts

Additional context

  2023/09/14 15:10:45 [k8saudit] bad request: http: request body too large

 2023/09/14 15:10:57 [k8saudit] bad request: http: request body too large

 2023/09/14 15:10:59 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:00 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:04 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:05 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:16 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:21 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:23 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:26 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:40 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:43 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:44 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:51 [k8saudit] bad request: http: request body too large

 2023/09/14 15:11:56 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:00 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:01 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:03 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:12 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:16 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:17 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:22 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:31 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:32 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:36 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:39 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:43 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:49 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:54 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:55 [k8saudit] bad request: http: request body too large

 2023/09/14 15:12:58 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:07 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:12 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:14 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:17 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:29 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:34 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:35 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:38 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:40 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:51 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:52 [k8saudit] bad request: http: request body too large

 2023/09/14 15:13:58 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:01 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:04 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:13 [k8saudit] bad request: http: request body too large

 2023/09/14 15:14:14 [k8saudit] bad request: http: request body too large

Andreagit97 commented 1 year ago

ei thank you for reporting!

Then the POD memory increased to 38gb or more and the cores either, so I would like to know it is a bug or not.

Uhm it seems like a bug, we need to investigate more on this!

antikilahdjs commented 1 year ago

Thank you so much @Andreagit97. I will send below a screenshoot from real query in Prometheus. I included a resources limits to 42gb but if remove those limits it will be reach out more than 120gb

Start the auditing and in 3 minutes the memory reach out 22gb

Andreagit97 commented 1 year ago

Thank you for the additional data, right now we are a little bit busy but we will come to it after the falco release!

poiana commented 10 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

antikilahdjs commented 10 months ago

Not fixed

Andreagit97 commented 10 months ago

/remove-lifecycle stale

sboschman commented 8 months ago

You increased max eventsize to 134Gb and max webhook batch size to 268Gb? In which case the memory usage is sort of expected I guess, as up to 268GB of json has to be processed at once...

A few things you might experiment with:

limit the number of events in a single batch by setting the --audit-webhook-batch-max-size flag on your api server, you might need multiple falco instances to keep up with your audit event stream, as you mention having a large cluster
use the falco tailored audit-policy.yaml (docs) in case you are not already doing so, as the api server can generate massive amounts of audit events which are not all relevant to falco
as some events include the requestObject (e.g. a ConfigMap), you might be able to find the event which includes some massive k8s object and consider dropping it using the audit-policy.yaml

poiana commented 5 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana commented 4 months ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana commented 3 months ago

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

poiana commented 3 months ago

@poiana: Closing this issue.

In response to [this](https://github.com/falcosecurity/falco/issues/2808#issuecomment-2256333799): >Rotten issues close after 30d of inactivity. > >Reopen the issue with `/reopen`. > >Mark the issue as fresh with `/remove-lifecycle rotten`. > >Provide feedback via https://github.com/falcosecurity/community. >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

falcosecurity / falco

Falco Webhook getting an error - "http: request body too large" #2808