falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0

Falco pod failure for GKE 5.10.133+ kernel #2330

Closed ronniee007 closed 1 year ago

ronniee007 commented 1 year ago

Describe the bug

Installing the Helm chart fails on GKE nodes running Container-Optimized OS (COS) with kernel 5.10.133+.

How to reproduce it

helm install falco-gke falcosecurity/falco --set driver.kind=ebpf
NAME: falco-gke
LAST DEPLOYED: Fri Dec 16 04:20:38 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Falco agents are spinning up on each node in your cluster. After a few
seconds, they are going to start monitoring your containers looking for
security issues.

The pod deployment stays unstable. After opening a shell in one of the pods, I see the error below. When I open the driver URL https://download.falco.org/driver/3.0.1%2Bdriver/x86_64/falco_cos_5.10.133%2B_1.o, the file is missing.

* Filename 'falco_cos_5.10.133+_1.o' is composed of:
 - driver name: falco
 - target identifier: cos
 - kernel release: 5.10.133+
 - kernel version: 1
* Trying to download a prebuilt eBPF probe from https://download.falco.org/driver/3.0.1%2Bdriver/x86_64/falco_cos_5.10.133%2B_1.o
curl: (7) Failed to connect to download.falco.org port 443: Connection timed out
Unable to find a prebuilt falco eBPF probe
* COS detected (build 16623.227.33), using COS kernel headers
* Found kernel config at /proc/config.gz
* Downloading https://storage.googleapis.com/cos-tools/16623.227.33/kernel-headers.tgz
* Setting up /usr/src links from host
* Running falco-driver-loader for: falco version=0.33.0, driver version=3.0.1+driver, arch=x86_64, kernel release=5.10.133+, kernel version=1
* Running falco-driver-loader with: driver=bpf, compile=yes, download=yes
* Mounting debugfs
mount: /sys/kernel/debug: cannot mount nodev read-only.
* Filename 'falco_cos_5.10.133+_1.o' is composed of:
 - driver name: falco
 - target identifier: cos
 - kernel release: 5.10.133+
 - kernel version: 1
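
For readers following the log, here is a minimal sketch of how the probe filename and download URL appear to be composed. The pattern is inferred only from the log lines above, not taken from the falco-driver-loader source, and the function names `encode_plus`, `probe_filename`, and `probe_url` are hypothetical.

```shell
#!/bin/sh
# Sketch inferred from the log output above; all function names are hypothetical.

# Percent-encode '+' (the only character the logged URL shows as encoded).
encode_plus() {
    printf '%s' "$1" | sed 's/+/%2B/g'
}

# probe_filename DRIVER_NAME TARGET_ID KERNEL_RELEASE KERNEL_VERSION
probe_filename() {
    printf '%s_%s_%s_%s.o' "$1" "$2" "$3" "$4"
}

# probe_url DRIVER_VERSION ARCH DRIVER_NAME TARGET_ID KERNEL_RELEASE KERNEL_VERSION
probe_url() {
    file="$(probe_filename "$3" "$4" "$5" "$6")"
    printf 'https://download.falco.org/driver/%s/%s/%s' \
        "$(encode_plus "$1")" "$2" "$(encode_plus "$file")"
}
```

Calling `probe_url 3.0.1+driver x86_64 falco cos 5.10.133+ 1` reproduces the URL shown in the log, which makes it easy to check with curl whether a prebuilt probe exists for a given kernel release before installing the chart.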

Screenshots

Screenshot 2022-12-16 at 4 40 20 AM

Environment

falco version=0.33.0, driver version=3.0.1+driver, arch=x86_64, kernel release=5.10.133+, kernel version=1

Additional context

Tried the driverkit repo build as well but still failing with below error:

go run main.go docker -c gke-driver.yaml 
INFO using config file                             file=gke-driver.yaml
ERRO error validating build options                error="target must be a valid target ([fedora vanilla amazonlinux2 debian centos rocky ubuntu almalinux amazonlinux photon redhat arch opensuse minikube amazonlinux2022 flatcar])"

FedeDP commented 1 year ago

Hi! Thanks for opening this issue!

> Tried the driverkit repo build as well but still failing with below error:

Yep, driverkit does not support the cos target. But falco-driver-loader does not use driverkit; it just builds our BPF probe using clang. It seems like it is dying somewhere between:

I suspect the curl call is failing: https://github.com/falcosecurity/falco/blob/master/scripts/falco-driver-loader#L515. But I am able to fully download the kernel headers from the URL printed in the log (https://storage.googleapis.com/cos-tools/16623.227.33/kernel-headers.tgz), so I don't quite get why it would fail.

Perhaps mkdir /tmp/kernel is failing instead? https://github.com/falcosecurity/falco/blob/master/scripts/falco-driver-loader#L512

FedeDP commented 1 year ago

Btw we might try to add cos support in both https://github.com/falcosecurity/kernel-crawler and https://github.com/falcosecurity/driverkit, to start shipping prebuilt drivers for it!

ronniee007 commented 1 year ago

Thanks @FedeDP. At this point, is there any reference for how I can create my own builds without the helper tools Sysdig has today?

FedeDP commented 1 year ago

Mmmh is /tmp folder writable? Basically, the issue is somewhere here:

echo "* Downloading ${BPF_KERNEL_SOURCES_URL}"
mkdir -p /tmp/kernel
cd /tmp/kernel || exit
cd "$(mktemp -d -p /tmp/kernel)" || exit
if ! curl -L -o kernel-sources.tgz --create-dirs "${FALCO_DRIVER_CURL_OPTIONS}" "${BPF_KERNEL_SOURCES_URL}"; then
        >&2 echo "Unable to download the kernel sources"
        return
fi
echo "* Extracting kernel sources"

Since I was able to download the sources (https://storage.googleapis.com/cos-tools/16623.227.33/kernel-headers.tgz) with curl, I don't think the download itself is the issue, but you can easily check that manually by running curl against https://storage.googleapis.com/cos-tools/16623.227.33/kernel-headers.tgz from a node.
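
A minimal sketch of that manual check, assuming a POSIX shell with curl available on the node or in the driver-loader container. The helper names `can_create_workdir` and `fetch_cos_headers` are hypothetical; only the mkdir, mktemp, and curl steps mirror the snippet above.

```shell
#!/bin/sh
# Hypothetical helpers for checking the two steps suspected above.

# Check that a scratch directory can be created and written under "$1"
# (default: /tmp), mirroring the mkdir + mktemp steps of the loader snippet.
can_create_workdir() {
    base="${1:-/tmp}"
    mkdir -p "$base/kernel" || return 1
    workdir="$(mktemp -d "$base/kernel/XXXXXX")" || return 1
    # Prove the directory is writable, then clean up.
    touch "$workdir/.write-test" || return 1
    rm -rf "$workdir"
}

# Try to download the COS kernel headers for build "$1" into directory "$2".
# -f makes curl return a non-zero status on HTTP errors.
fetch_cos_headers() {
    build="$1"
    dest="$2"
    url="https://storage.googleapis.com/cos-tools/${build}/kernel-headers.tgz"
    curl -fL --connect-timeout 10 -o "$dest/kernel-headers.tgz" "$url"
}
```

For example, `can_create_workdir /tmp && fetch_cos_headers 16623.227.33 /tmp` exercises both steps; whichever command returns a non-zero status points at the failing one.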

> Thanks @FedeDP. At this point, is there any reference for how I can create my own builds without the helper tools Sysdig has today?

To create your own build, you should build the driver on a node (i.e., clone the libs repo, check out the 3.0.0+driver tag, and follow https://github.com/falcosecurity/libs#build-ebpf-probe); note that you still need the kernel headers to be installed.

leogr commented 1 year ago

cc @alacuku

ronniee007 commented 1 year ago

No progress yet; the COS restrictions are pretty daunting. Just curious: was the missing file upstream ever fixed?

alacuku commented 1 year ago

Hi @ronniee007, I tried to reproduce your issue by deploying a GKE cluster with the same OS version, COS 97 LTS. In my test I used v1.22.16-gke.1300 with image-type=cos_containerd, which gives kernel version 5.10.147+, and I'm not having issues.

As far as I know, there is no way to use a specific image version in the node pools of a GKE cluster, so I'm not able to use the same kernel version as you. Could you provide the k8s version?

poiana commented 1 year ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana commented 1 year ago

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

leogr commented 1 year ago

Hey @ronniee007

Is this still an issue?

eljefedelrodeodeljefe commented 1 year ago

It is for us, unfortunately, using the current Falco and charts.

alacuku commented 1 year ago

Hi @eljefedelrodeodeljefe, could you share the falco-driver-loader init container's logs?

ronniee007 commented 1 year ago

@leogr I can actually close it unless @eljefedelrodeodeljefe wants to keep it open.

poiana commented 1 year ago

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community.

/close

poiana commented 1 year ago

@poiana: Closing this issue.

In response to [this](https://github.com/falcosecurity/falco/issues/2330#issuecomment-1603855585):

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue with `/reopen`.
>
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Provide feedback via https://github.com/falcosecurity/community.
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

poiana commented 3 days ago

@hendraput: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to [this](https://github.com/falcosecurity/falco/issues/2330#issuecomment-2217886240):

> /reopen
>
> any update for this issue ?
>
> I have same issue with container-optimized-os GKE version 1.27.11-gke.1062004 and kernel version 5.15.146+
>
> ```
> ERROR no supported driver found for distro: cos, kernelrelease , kernelversion #1 SMP
> , arch x86_64
> INFO Running falcoctl driver install
> ├ driver version: 7.2.0+driver
> ├ driver type: modern_ebpf
> ├ driver name: falco
> ├ compile: true
> ├ download: true
> ├ target: cos
> ├ arch: x86_64
> ├ kernel release:
> └ kernel version: #1 SMP Sat Feb 17 13:12:02 UTC 2024
> ```

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.