Closed singhnix closed 1 year ago
I'm getting this error when starting the GuardDuty agent (v1.3.1-eksbuild.1) on an EKS cluster (1.28), on a node that runs Bottlerocket OS (1.16.1):
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
libbpf: failed to load object 'patrol_bpf'
libbpf: failed to load BPF skeleton 'patrol_bpf': -3
It seems like the kernel is missing the "patrol_bpf" module
@kmute90
You need a kernel compiled with CONFIG_DEBUG_INFO_BTF=y. I assume that should just work without any modification, but I still want to confirm.
You can check whether your kernel has BTF built in by running:
# ls -la /sys/kernel/btf/vmlinux
Or by:
cat /boot/config-$(uname -r) | grep CONFIG_DEBUG_INFO_BTF
You should have the vmlinux file and have CONFIG_DEBUG_INFO_BTF=y.
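The two checks above can be combined into one helper. This is only a sketch: the function name `check_btf` and its return codes are made up for illustration, and the default paths are the ones quoted in this comment. On some hosts the config file lives elsewhere (for example `/boot/config`), so the second argument may need to be overridden.

```shell
#!/bin/sh
# Hypothetical helper: report whether the running kernel appears to expose BTF.
# $1: path to the BTF blob (default: /sys/kernel/btf/vmlinux)
# $2: path to the kernel config (default: /boot/config-$(uname -r))
check_btf() {
    btf_file="${1:-/sys/kernel/btf/vmlinux}"
    config_file="${2:-/boot/config-$(uname -r)}"

    # A readable vmlinux BTF file is what libbpf looks for first.
    if [ -r "$btf_file" ]; then
        echo "BTF present: $btf_file"
        return 0
    fi

    # The option is enabled in the config, but the BTF file is not visible.
    if [ -r "$config_file" ] && grep -q '^CONFIG_DEBUG_INFO_BTF=y$' "$config_file"; then
        echo "CONFIG_DEBUG_INFO_BTF=y is set, but $btf_file is not readable"
        return 1
    fi

    echo "No BTF found: $btf_file not readable and CONFIG_DEBUG_INFO_BTF=y not confirmed"
    return 2
}
```

Calling `check_btf` with no arguments uses the default paths. Running it inside the agent's container rather than on the host would show whether the file is actually visible to the process that loads the BPF object.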
Thank you for your response.
I checked the Bottlerocket GitHub repository, and the option should be enabled in the kernel build: https://github.com/bottlerocket-os/bottlerocket/pull/799
I ran the commands you suggested, and the results were:
[root@admin]# ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 4601363 Nov 23 09:49 /sys/kernel/btf/vmlinux
bash-5.1# cat /boot/config | grep CONFIG_DEBUG_INFO_BTF
CONFIG_DEBUG_INFO_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
@kmute90 We have internally reproduced and fixed the issue, and it will come with the next aws-guardduty-agent release.
Nice, thank you! Do you have an ETA for the next release?
@weitsochen wrote:
We have internally reproduced and fixed the issue, and it will come with the next aws-guardduty-agent release.
Can you provide more details about the issue you fixed?
Does it manifest in the official AWS images?
Is it specific to EKS 1.28?
Tell us about your request
The GuardDuty add-on has been released for EKS clusters, but it does not work on Bottlerocket nodes.
Which service(s) is this request for? EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? The GuardDuty add-on has been released for EKS clusters, but it does not work on Bottlerocket nodes.
Are you currently working around this issue? No workaround.
Additional context
Below are the steps I followed to install the add-on on the EKS cluster:
kubectl get pods -n amazon-guardduty
NAME                        READY   STATUS             RESTARTS      AGE
aws-guardduty-agent-rq2fp   0/1     CrashLoopBackOff   7 (49s ago)   11m
kubectl describe pods aws-guardduty-agent-rq2fp -n amazon-guardduty
Name: aws-guardduty-agent-rq2fp
Namespace: amazon-guardduty
Priority: 0
Node: ip-192-168-126-188.ec2.internal/192.168.126.188
Start Time: Wed, 05 Apr 2023 12:11:04 +0530
Labels: app.kubernetes.io/name=aws-guardduty-agent
controller-revision-hash=5f98984754
pod-template-generation=1
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 192.168.126.188
IPs:
IP: 192.168.126.188
Controlled By: DaemonSet/aws-guardduty-agent
Containers:
aws-guardduty-agent:
Container ID: containerd://9a0b0b09d783834b63cd9a2bc10a6230c7c7147e5b1e7b8a7088d3f69531d619
Image: 031903291036.dkr.ecr.us-east-1.amazonaws.com/aws-guardduty-agent:v1.0.0
Image ID: 031903291036.dkr.ecr.us-east-1.amazonaws.com/aws-guardduty-agent@sha256:e38bdd2b1323e89113f1a31bd4bc8e5a8098525dd98e6981a28b9906b1e4411e
Port:
Host Port:
State: Waiting
Reason: RunContainerError
Last State: Terminated
Reason: StartError
Message: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: can't set process label: open /proc/thread-self/attr/exec: read-only file system: unknown
Exit Code: 128
Started: Thu, 01 Jan 1970 05:30:00 +0530
Finished: Wed, 05 Apr 2023 12:14:07 +0530
Ready: False
Restart Count: 5
Limits:
memory: 1Gi
Requests:
memory: 256Mi
Environment:
CLUSTER_NAME: eksdemo1
Mounts:
/proc from host-proc (ro)
/run/containerd/containerd.sock from containerd-sock (ro)
/run/docker.sock from docker-sock (ro)
/sys/kernel/debug from host-kernel-debug (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hpthb (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
docker-sock:
Type: HostPath (bare host directory volume)
Path: /var/run/docker.sock
HostPathType:
containerd-sock:
Type: HostPath (bare host directory volume)
Path: /var/run/containerd/containerd.sock
HostPathType:
host-proc:
Type: HostPath (bare host directory volume)
Path: /proc
HostPathType:
host-kernel-debug:
Type: HostPath (bare host directory volume)
Path: /sys/kernel/debug
HostPathType:
kube-api-access-hpthb:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type     Reason     Age                  From               Message
Normal   Scheduled  3m15s                default-scheduler  Successfully assigned amazon-guardduty/aws-guardduty-agent-sgqhp to ip-192-168-126-188.ec2.internal
Normal   Pulling    3m14s                kubelet            Pulling image "031903291036.dkr.ecr.us-east-1.amazonaws.com/aws-guardduty-agent:v1.0.0"
Normal   Pulled     3m9s                 kubelet            Successfully pulled image "031903291036.dkr.ecr.us-east-1.amazonaws.com/aws-guardduty-agent:v1.0.0" in 4.984017648s
Normal   Created    93s (x5 over 3m9s)   kubelet            Created container aws-guardduty-agent
Warning  Failed     93s (x5 over 3m8s)   kubelet            Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: can't set process label: open /proc/thread-self/attr/exec: read-only file system: unknown
Normal   Pulled     93s (x4 over 3m7s)   kubelet            Container image "031903291036.dkr.ecr.us-east-1.amazonaws.com/aws-guardduty-agent:v1.0.0" already present on machine
Warning  BackOff    92s (x9 over 3m6s)   kubelet            Back-off restarting failed container
I tried to work around this by mounting /proc with readOnly: false, but the pod still went into CrashLoopBackOff with the error below:
kubectl logs aws-guardduty-agent-xxxxx -n amazon-guardduty
2023-04-05T07:04:12.822273Z INFO amzn_guardduty_agent: GuardDuty agent starting with 8 worker thread(s) and 100 max blocking threads.
2023-04-05T07:04:12.984655Z INFO amzn_guardduty_agent: Agent fingerprint: f3962a7b731cfd20ce9570140f7d481102c18ca98465439ddb299a047f6ef95e
2023-04-05T07:04:12.985831Z ERROR amzn_guardduty_agent: Dependency check failed - Invalid kernel version 5.15.90
Error: Pipeline(DependencyError("Invalid kernel version 5.15.90"))
According to the documentation at https://docs.aws.amazon.com/guardduty/latest/ug/guardduty-eks-runtime-monitoring.html#eksrunmon-verified-platform, kernel 5.15 is not supported.
Hence, I think a feature request should be created to make this add-on work on Bottlerocket and to add kernel 5.15 support.
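The agent's own dependency check is not shown in this issue, so the following is only a rough sketch of how a node's kernel could be compared against a list of supported versions before the add-on is installed. The function names `kernel_major_minor` and `kernel_is_listed` are hypothetical, and the list of supported versions must be taken from the documentation linked above. The values in the usage note are placeholders, not an authoritative list.

```shell
#!/bin/sh
# Hypothetical helper: extract "major.minor" from a kernel release string
# such as 5.15.90. Prints nothing if the string has no such prefix.
kernel_major_minor() {
    printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: return 0 if the release's major.minor equals one of
# the remaining arguments, 1 if it does not, 2 if the release is unparsable.
kernel_is_listed() {
    release="$1"
    shift
    current=$(kernel_major_minor "$release")
    [ -n "$current" ] || return 2
    for supported in "$@"; do
        [ "$current" = "$supported" ] && return 0
    done
    return 1
}
```

For example, `kernel_is_listed "$(uname -r)" 5.4 5.10` (placeholder versions) returns a non-zero status on a node that reports 5.15.90, which matches the "Invalid kernel version" failure in the agent log.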