Falco 0.33.1 "node name does not correspond to a node in the cluster" during startup due to jq filter failure on NotReady node with status.addresses missing #2358
Describe the bug
When starting falco on EKS with:
we've experienced whole DaemonSet failures (all new pods failing to start/restart) reporting errors like:
Error fetching K8s data: Failing to enrich events with Kubernetes metadata: node name does not correspond to a node in the cluster: ip-xxx-yy-zzz-www.us-west-1.compute.internal
After some looking around and enabling libs_logger.enabled: true, we've been able to narrow it down to https://github.com/falcosecurity/libs/blob/01c07df720708f19b6ba3e2f6857bddb8c2c4779/userspace/libsinsp/socket_handler.h#L792 causing this error line:
Digging further, this failure is caused by a NotReady node being returned which does not present any .addresses in the .status field. Example:
How to reproduce it
Remove the status.addresses field from a single k8s node returned by https://172.20.0.1/api/v1/nodes?pretty=false
Expected behaviour
Such a node should not prevent all other Falco pods from starting.
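To illustrate the expected behaviour, here is a minimal Python sketch of tolerant address extraction from a node list. It is not Falco's jq filter or libsinsp code; the payload, the node names (node-a, node-b) and the helper node_addresses are hypothetical and only mimic a node whose status has no addresses key.

```python
import json

# Hypothetical, trimmed NodeList payload (not taken from a real cluster).
# The second node mimics a NotReady node whose status lacks "addresses".
NODE_LIST = json.loads("""
{
  "items": [
    {"metadata": {"name": "node-a"},
     "status": {"addresses": [{"type": "InternalIP", "address": "10.0.0.1"},
                              {"type": "Hostname", "address": "node-a"}]}},
    {"metadata": {"name": "node-b"},
     "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}}
  ]
}
""")


def node_addresses(node_list):
    """Map each node name to its addresses, treating missing fields as empty."""
    result = {}
    for item in node_list.get("items", []):
        name = item.get("metadata", {}).get("name")
        if name is None:
            # Skip entries without a name instead of failing the whole list.
            continue
        status = item.get("status") or {}
        addresses = status.get("addresses") or []
        result[name] = [a["address"] for a in addresses if "address" in a]
    return result


if __name__ == "__main__":
    print(node_addresses(NODE_LIST))
```

The point of the sketch is that a node without status.addresses yields an empty address list rather than an error, so the remaining nodes are still processed.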
Environment