@nc-pnan, does Falco log which rule is triggered when it fails? Any info on how to reproduce this would be helpful.
@alacuku I unfortunately don't have much more information on how to reproduce this than what is here, since the cluster it was deployed to is fairly extensive. But if there are any specifics you are interested in, please let me know.
The only triggered rule I can currently find is this one:
{"hostname":"falco-wm9ks","output":"13:09:22.412554500: Notice Unexpected connection to K8s API Server from container (connection=10.244.1.126:45008->10.16.0.1:443 lport=443 rport=45008 fd_type=ipv4 fd_proto=fd.l4proto evt_type=connect user= user_uid=4294967295 user_loginuid=-1 process=<NA> proc_exepath= parent=<NA> command=<NA> terminal=0 container_id= container_image=<NA> container_image_tag=<NA> container_name=<NA> k8s_ns=<NA> k8s_pod_name=<NA>)","priority":"Notice","rule":"Contact K8S API Server From Container","source":"syscall","tags":["T1565","container","k8s","maturity_stable","mitre_discovery","network"],"time":"2024-02-08T13:09:22.412554500Z", "output_fields": {"container.id":"","container.image.repository":null,"container.image.tag":null,"container.name":null,"evt.time":1707397762412554500,"evt.type":"connect","fd.lport":443,"fd.name":"10.244.1.126:45008->10.16.0.1:443","fd.rport":45008,"fd.type":"ipv4","k8s.ns.name":null,"k8s.pod.name":null,"proc.cmdline":"<NA>","proc.exepath":"","proc.name":"<NA>","proc.pname":null,"proc.tty":0,"user.loginuid":-1,"user.name":"","user.uid":4294967295}}
However, we did also get FalcoExporterAbsent alerts earlier, but that alert is currently not firing for some reason, even though the exporter is in CrashLoopBackOff. This is the rule definition as shown in Prometheus:
name: FalcoExporterAbsent
expr: absent(up{job="falco-falco-exporter"})
for: 10m
labels:
  prometheus: monitoring/prometheus-default-prometheus
  prometheus_replica: prometheus-prometheus-default-prometheus-0
  severity: critical
annotations:
  description: No metrics are being scraped from Falco. No events will trigger any alerts.
  summary: Falco Exporter has disappeared from Prometheus service discovery.
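One possible reason the alert stays silent: absent(up{job="falco-falco-exporter"}) only fires when the up series is missing from Prometheus entirely, so if the crash-looping exporter's target is still registered and scraped, up is simply 0 and absent() returns nothing. A minimal sketch of a complementary rule that would also catch that case (hypothetical, assuming the same job label):

- alert: FalcoExporterDown
  expr: up{job="falco-falco-exporter"} == 0
  for: 10m
  labels:
    severity: critical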
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle rotten
/remove-lifecycle rotten
Hey! This should be fixed in the latest Falco release, 0.38.0. This should be the fix: https://github.com/falcosecurity/libs/pull/1800
This has been fixed by 0.38 AFAIK. So, /close
@leogr: Closing this issue.
Describe the bug
We are deploying Falco with Falcosidekick and falco-exporter using these Helm charts as a DaemonSet, creating Falco instances on 3 nodes. On 2 of the nodes everything runs without issues, but on the 3rd node the falco-exporter pod keeps failing into a CrashLoopBackOff state. Inspecting the log of the falco-exporter container, this is the output containing the error message:
We get the following error from the Falco pod itself:
[libprotobuf ERROR google/protobuf/wire_format_lite.cc:577] String field 'falco.outputs.response.OutputFieldsEntry.value' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
We have tried redeploying the charts several times and it is always the instance connected to one specific node that is failing, but we have not been able to figure out the issue on our end, since all nodes should be configured identically.
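For context on the libprotobuf error above: protocol buffers require string fields to be valid UTF-8, so an output field carrying raw non-UTF-8 bytes cannot be serialized cleanly over the gRPC output stream. A minimal Go sketch of that general constraint (not Falco's actual code; wrapperspb.StringValue is just a stand-in message):

package main

import (
	"fmt"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/wrapperspb"
)

func main() {
	// A proto3 string field holding bytes that are not valid UTF-8
	// (0xff can never appear in well-formed UTF-8).
	msg := wrapperspb.String(string([]byte{0xff, 0xfe}))

	// Marshal is expected to reject this, because proto3 string fields
	// must be valid UTF-8; raw bytes belong in a bytes field instead.
	// The C++ runtime used by Falco logs the analogous
	// "contains invalid UTF-8 data when serializing" error instead.
	if _, err := proto.Marshal(msg); err != nil {
		fmt.Println("marshal failed:", err)
	}
}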
How to reproduce it
Deploy the Falco, Falcosidekick and falco-exporter charts with this umbrella Chart.yaml and values.yaml configuration to an AKS cluster running 3 nodes:
Chart.yaml:
Values.yaml:
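As a rough, hypothetical sketch (the exact Chart.yaml and values.yaml used in this report may differ), an umbrella chart pulling in the three falcosecurity charts could look like this; chart versions and values are placeholders:

# Hypothetical umbrella Chart.yaml sketch; versions are placeholders.
apiVersion: v2
name: falco-umbrella
version: 0.1.0
dependencies:
  - name: falco
    repository: https://falcosecurity.github.io/charts
    version: 4.x.x
  - name: falcosidekick
    repository: https://falcosecurity.github.io/charts
    version: 0.x.x
  - name: falco-exporter
    repository: https://falcosecurity.github.io/charts
    version: 0.x.x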
Expected behaviour
We expect the falco exporter to be running on all three nodes.
Screenshots
None.
Environment
Falco version: 0.37.0 (x86_64)
Falco initialized with configuration file: /etc/falco/falco.yaml
System info: Linux version 5.15.138.1-4.cm2 (root@CBL-Mariner) (gcc (GCC) 11.2.0, GNU ld (GNU Binutils) 2.37) #1 SMP Thu Nov 30 21:48:10 UTC 2023
Loading rules from file /etc/falco/falco_rules.yaml
{
  "machine": "x86_64",
  "nodename": "falco-dnncw",
  "release": "5.15.138.1-4.cm2",
  "sysname": "Linux",
  "version": "#1 SMP Thu Nov 30 21:48:10 UTC 2023"
}
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian