Closed danieljmt closed 1 year ago
Hi! Thanks for opening this bug report! Could you share how you are starting Falco? Through the rpm package, or through the Falco docker image?
We're building our own docker image which installs it via rpm (as seen below), and then runs /usr/bin/falco-driver-loader
and /usr/bin/falco -v --cri /run/containerd/containerd.sock
rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc
curl -s -o /etc/yum.repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo
dnf install -y /tmp/rpms/*
dnf -y install falco
dnf upgrade -y
Thanks for the info!
So, can you run this while the kmod is loaded:
ls -la /sys/module/falco/parameters/g_buffer_bytes_dim
And paste the output here?
[root@ source]# ls -la /sys/module/falco/parameters/g_buffer_bytes_dim
ls: cannot access '/sys/module/falco/parameters/g_buffer_bytes_dim': No such file or directory
Ok thanks! Care to share what modinfo falco shows?
Btw, are you starting the docker image in privileged mode? Falco needs rw access to the /sys/module/falco folder.
Yeah. We've changed nothing recently; it had been working for years, and this has only come up in the last couple of weeks.
[root source]# modinfo falco
filename: /lib/modules/4.18.0-372.26.1.el8_6.x86_64/extra/falco.ko.xz
schema_version: 2.0.0
api_version: 1.0.0
build_commit:
version: 2.0.0+driver
author: the Falco authors
license: GPL
rhelversion: 8.6
srcversion: 8150520A5FB9859E9734934
depends:
name: falco
vermagic: 4.18.0-372.26.1.el8_6.x86_64 SMP mod_unload modversions
sig_id: PKCS#7
signer: DKMS module signing key
sig_key: 62:BD:74:D0:45:10:7C:7D:2B:34:1B:06:AF:BA:89:B4:3B:40:8F:20
sig_hashalgo: sha512
signature: 74:7A:43:85:00:26:0C:B2:07:DD:79:E6:7E:D4:93:60:18:B8:AC:1A:
95:A6:68:0C:10:D8:A2:BB:46:BB:AC:E7:06:F7:B2:28:C4:6B:80:B5:
1B:5B:CF:77:3F:A0:0A:20:C2:3D:5A:93:18:E3:26:8B:32:E8:3E:56:
F0:17:43:52:74:B7:C2:29:60:0F:AD:D4:D6:E3:43:A4:25:72:9D:E3:
4C:BD:EA:D6:29:3E:EC:EB:8E:3B:69:04:FD:8D:02:D8:D6:E0:77:4D:
14:57:8F:E2:75:CA:E1:FB:45:63:DC:50:98:97:2F:56:62:6E:19:C6:
C5:1F:93:70:27:D5:3E:85:B7:3C:F7:DA:F6:38:C7:D8:8B:27:B6:CE:
25:CE:59:5F:95:96:F2:8D:86:F5:2B:14:E9:CD:37:00:13:A4:D9:09:
17:6F:54:69:56:6F:EF:67:82:45:DF:F8:7A:60:20:25:9D:87:83:0F:
6E:8A:4D:22:8A:2A:85:6C:30:C0:5B:D2:E5:D9:9C:93:F9:C3:D0:3B:
79:7B:7B:C3:1B:85:2E:29:85:5A:13:79:6D:E8:22:0B:DF:2D:D4:A6:
68:48:71:26:44:CA:32:CC:27:1C:96:86:E6:E5:C0:12:23:BD:69:26:
E5:BD:7C:88:A5:E2:4F:FA:DE:40:08:5A:3F:F2:38:34
parm: max_consumers:Maximum number of consumers that can simultaneously open the devices (uint)
parm: verbose:Enable verbose logging (bool)
So, you are using the 2.0.0+driver, but unfortunately Falco 0.33 needs the 3.0.1+driver.
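A quick way to spot this kind of mismatch is to compare the `version:` field of the loaded module with the driver version Falco expects. The following is a minimal sketch, not part of falco-driver-loader: the function names `driver_version_from_modinfo` and `driver_matches` are hypothetical, and the sample text is trimmed from the modinfo output above. On a live host, `modinfo -F version falco` prints the same field directly.

```shell
# Print the value of the "version:" field from modinfo-style text on stdin.
# "schema_version:" and "api_version:" are different fields and are skipped.
driver_version_from_modinfo() {
    awk '$1 == "version:" { print $2; exit }'
}

# Succeed when the expected driver version ($1) equals the loaded one ($2).
driver_matches() {
    [ "$1" = "$2" ]
}

# Sample input trimmed from the modinfo output in this thread.
sample='schema_version: 2.0.0
api_version: 1.0.0
version: 2.0.0+driver'

expected="3.0.1+driver"   # the driver version Falco 0.33 expects, per this thread
loaded=$(printf '%s\n' "$sample" | driver_version_from_modinfo)

if driver_matches "$expected" "$loaded"; then
    echo "driver OK: $loaded"
else
    echo "driver mismatch: loaded $loaded, expected $expected"
fi
```

With the sample data the check takes the mismatch branch, which is the situation described in this thread.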
Creating symlink /var/lib/dkms/falco/3.0.1+driver/source -> /usr/src/falco-3.0.1+driver
- Running dkms build failed, couldn't find /var/lib/dkms/falco/3.0.1+driver/build/make.log (with GCC /usr/bin/gcc)
- Trying to load a system falco module, if present
- Success: falco module found and loaded with modprobe
Indeed it is loading the old system module that is the 2.0.0 one. I think you need to install kernel src, given this error message:
Module build for the currently running kernel was skipped since the kernel source for this kernel does not seem to be installed.
That's odd, from this message it'd suggest it's 3.0.1+driver
[root /]# /usr/bin/falco-driver-loader --compile
* Running falco-driver-loader for: falco version=0.33.0, driver version=3.0.1+driver, arch=x86_64, kernel release=4.18.0-372.26.1.el8_6.x86_64, kernel version=1
* Running falco-driver-loader with: driver=module, compile=yes, download=no
================ Cleaning phase ================
* 1. Check if kernel module 'falco' is still loaded:
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- OK! Unloading 'falco' module succeeded.
* 2. Check all versions of kernel module 'falco' in dkms:
- There are some versions of 'falco' module in dkms.
* 3. Removing all the following versions from dkms:
3.0.1+driver
- Removing 3.0.1+driver...
Deleting module falco-3.0.1+driver completely from the DKMS tree.
- OK! Removing '3.0.1+driver' succeeded.
[SUCCESS] Cleaning phase correctly terminated.
================ Cleaning phase ================
* Looking for a falco module locally (kernel 4.18.0-372.26.1.el8_6.x86_64)
* Filename 'falco_rhel_4.18.0-372.26.1.el8_6.x86_64_1.ko' is composed of:
- driver name: falco
- target identifier: rhel
- kernel release: 4.18.0-372.26.1.el8_6.x86_64
- kernel version: 1
* Trying to dkms install falco module with GCC /usr/bin/gcc
Sign command: /lib/modules/4.18.0-372.26.1.el8_6.x86_64/build/scripts/sign-file
Binary /lib/modules/4.18.0-372.26.1.el8_6.x86_64/build/scripts/sign-file not found, modules won't be signed
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
Creating symlink /var/lib/dkms/falco/3.0.1+driver/source -> /usr/src/falco-3.0.1+driver
* Running dkms build failed, couldn't find /var/lib/dkms/falco/3.0.1+driver/build/make.log (with GCC /usr/bin/gcc)
* Trying to load a system falco module, if present
* Success: falco module found and loaded with modprobe
We're not explicitly installing 2.0.0+driver anywhere
It tries to build the 3.0.1 driver, but it cannot because you are missing the kernel src/headers for your currently running kernel:
- Running dkms build failed, couldn't find /var/lib/dkms/falco/3.0.1+driver/build/make.log (with GCC /usr/bin/gcc)
Then it falls back to loading a system falco module, if found (and of course, it finds the old 2.0.0 one!).
[root source]# rpm -qa | grep kernel
kernel-headers-4.18.0-372.32.1.el8_6.x86_64
kernel-devel-4.18.0-372.32.1.el8_6.x86_64
Is kernel-devel-4.18.0-372.32.1.el8_6.x86_64 not the right package? Is there something else I'm missing?
It seems like you are running 4.18.0-372.26.1.el8_6.x86_64 instead; most probably you upgraded your system, which installed a new kernel and kernel-devel, but you haven't rebooted into the new kernel!
Are we talking about the system the pod is running on here, or the container?
The system and the container share the same kernel, so you need to reboot the host system, given that its kernel was updated ;)
I rebooted the node, no improvement. Is there a way to get it to load the system falco module @ 3.0.1?
Can you share uname -r now?
No change, I think that's the intended version for the node. Don't think there's been any issues with the desync in versions in the past.
4.18.0-372.26.1.el8_6.x86_64
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
10.95.190.46 Ready <none> 90d v1.23.12+IKS 10.95.190.46 150.238.203.211 Red Hat Enterprise Linux 8.6 (Ootpa) 4.18.0-372.26.1.el8_6.x86_64 cri-o://1.23.3-17.rhaos4.10.git016b1ca.el8
That's weird though, because you have got kernel headers and devel for the wrong version installed:
kernel-headers-4.18.0-372.32.1.el8_6.x86_64
kernel-devel-4.18.0-372.32.1.el8_6.x86_64
You need to either upgrade the running kernel (to match the kernel headers version) or downgrade the headers.
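The comparison being made here can be sketched as a small check: does the installed package list contain a kernel-devel package for exactly the running kernel release? The helper name `has_kernel_devel` is hypothetical, and the values are copied from this thread rather than read from a live system.

```shell
# Succeed when the package list on stdin contains kernel-devel for release $1.
# -F matches the name literally (the dots are not regex wildcards) and
# -x requires the whole line to match.
has_kernel_devel() {
    grep -qxF "kernel-devel-$1"
}

# Values copied from this thread. On a live host they would come from
# `uname -r` and `rpm -qa | grep kernel` instead.
running="4.18.0-372.26.1.el8_6.x86_64"
installed='kernel-headers-4.18.0-372.32.1.el8_6.x86_64
kernel-devel-4.18.0-372.32.1.el8_6.x86_64'

if printf '%s\n' "$installed" | has_kernel_devel "$running"; then
    echo "kernel-devel matches running kernel $running"
else
    echo "no kernel-devel package for running kernel $running"
fi
```

Note that inside a container `uname -r` reports the host kernel, while `rpm -qa` reports the container's packages, which is how the two can drift apart.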
The headers were installed during the docker image build, I'm not sure if they're even installed on the node (nor do I have access to it to check, or install). Is there any other way around it that doesn't involve the kernel headers?
Nope, unfortunately kernel-headers are mandatory to build a module against a kernel!
Is there a way of making it use the system kernel headers over the container ones, or using a prebuilt module? (I assume 2.0.0+driver was prebuilt?)
Just trying to avoid having to keep our container packages in sync with the system, as we have a large estate with quite complex deployments. Also very odd that this has never cropped up before - I assume the 2.0.0+driver was being used and it's now not backwards compatible?
[root /]# uname -r
4.18.0-372.26.1.el8_6.x86_64
[root /]# rpm -qa | grep kernel
kernel-tools-libs-4.18.0-372.26.1.el8_6.x86_64
kernel-4.18.0-372.26.1.el8_6.x86_64
kernel-core-4.18.0-372.26.1.el8_6.x86_64
kernel-devel-4.18.0-372.26.1.el8_6.x86_64
kernel-modules-4.18.0-372.26.1.el8_6.x86_64
kernel-tools-4.18.0-372.26.1.el8_6.x86_64
kernel-headers-4.18.0-372.26.1.el8_6.x86_64
That's what's installed at the system level
Of course you can! You must share the /lib/modules volume from the host:
-v /lib/modules:/host/lib/modules
Thank you! I'll give that a go and report back 😄
Oh looks like we were already doing that actually. Any ideas how to get it using the right one?
[root modules]# pwd
/lib/modules
[root modules]# ls
4.18.0-372.19.1.el8_6.x86_64 4.18.0-372.26.1.el8_6.x86_64
[root modules]# cd /host/lib/modules/
[root modules]# ls
4.18.0-372.19.1.el8_6.x86_64 4.18.0-372.26.1.el8_6.x86_64
...
volumes:
  - name: lib-modules
    hostPath:
      path: /lib/modules
...
containers:
    env:
      - name: HOST_ROOT
        value: "/host"
    securityContext:
      privileged: true
...
    volumeMounts:
      - mountPath: /host/lib/modules
        name: lib-modules
        readOnly: false
...
Wow, then it should work :/ HOST_ROOT is also set.
Can you share ls -la /host/lib/modules/4.18.0-372.26.1.el8_6.x86_64/build ?
lrwxrwxrwx. 1 root root 45 Aug 27 07:05 /host/lib/modules/4.18.0-372.26.1.el8_6.x86_64/build -> /usr/src/kernels/4.18.0-372.26.1.el8_6.x86_64
[root@kube-bmtl9fnd0adq142d4430-devussouthc-kuberes-000025f5 kernels]# ls /usr/src/kernels/
4.18.0-372.32.1.el8_6.x86_64 kernels
Doesn't appear in that folder, though
How come falco-driver-loader cleans up any compiled drivers? As far as I can tell the driver is compiled when we do dnf -y install falco
(which we do during the docker build stage)
Is there anything under the /usr/src/kernels/4.18.0-372.32.1.el8_6.x86_64/ folder?
How come falco-driver-loader cleans up any compiled drivers?
Yep, and it gets recompiled every time Falco is upgraded and needs a new driver; that's why it cleans up the old driver.
[root kernels]# ls -la /usr/src/kernels/4.18.0-372.32.1.el8_6.x86_64/
total 8152
drwxr-xr-x. 23 root root 4096 Nov 4 14:21 .
drwxr-xr-x. 1 root root 4096 Nov 4 14:25 ..
-rw-r--r--. 1 root root 196170 Oct 7 17:11 .config
-rw-r--r--. 1 root root 600 Oct 7 17:11 Kconfig
-rw-r--r--. 1 root root 388 Oct 7 17:11 Kconfig.redhat
-rw-r--r--. 1 root root 61608 Oct 7 17:11 Makefile
-rw-r--r--. 1 root root 2337 Oct 7 17:11 Makefile.rhelver
-rw-r--r--. 1 root root 1366972 Oct 7 17:11 Module.symvers
-rw-r--r--. 1 root root 4362925 Oct 7 17:11 System.map
drwxr-xr-x. 26 root root 4096 Nov 4 14:21 arch
drwxr-xr-x. 3 root root 4096 Nov 4 14:21 block
drwxr-xr-x. 2 root root 4096 Nov 4 14:21 certs
drwxr-xr-x. 4 root root 4096 Nov 4 14:21 crypto
drwxr-xr-x. 137 root root 4096 Nov 4 14:21 drivers
drwxr-xr-x. 2 root root 4096 Nov 4 14:21 firmware
drwxr-xr-x. 73 root root 4096 Nov 4 14:21 fs
drwxr-xr-x. 32 root root 4096 Nov 4 14:21 include
drwxr-xr-x. 2 root root 4096 Nov 4 14:21 init
drwxr-xr-x. 2 root root 4096 Nov 4 14:21 ipc
drwxr-xr-x. 18 root root 4096 Nov 4 14:21 kernel
drwxr-xr-x. 21 root root 4096 Nov 4 14:21 lib
drwxr-xr-x. 3 root root 4096 Nov 4 14:21 mm
drwxr-xr-x. 72 root root 4096 Nov 4 14:21 net
drwxr-xr-x. 27 root root 4096 Nov 4 14:21 samples
drwxr-xr-x. 13 root root 4096 Nov 4 14:21 scripts
drwxr-xr-x. 11 root root 4096 Nov 4 14:21 security
drwxr-xr-x. 27 root root 4096 Nov 4 14:21 sound
drwxr-xr-x. 29 root root 4096 Nov 4 14:21 tools
drwxr-xr-x. 2 root root 4096 Nov 4 14:21 usr
drwxr-xr-x. 4 root root 4096 Nov 4 14:21 virt
-rw-r--r--. 1 root root 2230264 Oct 7 17:11 vmlinux.h
-rw-r--r--. 1 root root 41 Oct 7 17:11 vmlinux.id
This is our run script btw
# Default location of kernel sources on the host.
KERNEL_SOURCE="$HOST_ROOT/usr/src"
# shellcheck source=/dev/null
source "$HOST_ROOT/etc/os-release"
# On Red Hat hosts the sources live under /usr/src/kernels.
if [ "$PRETTY_NAME" == "Red Hat" ]
then
    KERNEL_SOURCE="$HOST_ROOT/usr/src/kernels"
    for i in "$KERNEL_SOURCE"/*
    do
        base=$(basename "$i")
        ln -s "$i" "/usr/src/kernels/$base"
    done
fi
# Link the host kernel sources into the container's /usr/src.
for i in "$KERNEL_SOURCE"/*
do
    base=$(basename "$i")
    ln -s "$i" "/usr/src/$base"
done
# Link the host kernel module trees into the container's /lib/modules.
for i in "$HOST_ROOT/lib/modules"/*
do
    base=$(basename "$i")
    ln -s "$i" "/lib/modules/$base"
done
/usr/bin/falco-driver-loader
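One fragile spot in a script like this is the exact comparison against PRETTY_NAME. That field is a free-form display string (the node in this thread reports "Red Hat Enterprise Linux 8.6 (Ootpa)"), so it will not equal "Red Hat". A possible alternative, sketched here under assumptions, is to test the machine-readable ID and ID_LIKE fields of os-release. The function name `is_rhel_like` is hypothetical, and the ID and ID_LIKE values in the sample file are assumed rather than taken from the thread.

```shell
# Succeed when the os-release file at $1 identifies RHEL or a RHEL-like distro.
# ID and ID_LIKE are machine-readable fields; PRETTY_NAME is a display string.
is_rhel_like() {
    (
        # Subshell: variables from the sourced file do not leak into the caller.
        ID='' ID_LIKE=''
        . "$1"
        case " $ID $ID_LIKE " in
            *" rhel "*) exit 0 ;;
            *) exit 1 ;;
        esac
    )
}

# Sample file for the demo. PRETTY_NAME is the OS image string reported for
# the node in this thread; ID and ID_LIKE are assumed values.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ID="rhel"
ID_LIKE="fedora"
PRETTY_NAME="Red Hat Enterprise Linux 8.6 (Ootpa)"
EOF

if is_rhel_like "$sample"; then
    echo "RHEL-like host: link sources from usr/src/kernels"
else
    echo "not RHEL-like"
fi
rm -f "$sample"
```

In the run script this would be called as `is_rhel_like "$HOST_ROOT/etc/os-release"` in place of the PRETTY_NAME test.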
Uh, I just noticed that we need the 4.18.0-372.26.1.el8_6.x86_64 kernel sources, but you have:
ls /usr/src/kernels/
4.18.0-372.32.1.el8_6.x86_64 kernels
It seems you are missing the sources package for 4.18.0-372.26!
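The symptom here is a `build` symlink under /lib/modules that points at a sources directory which does not exist in the container. A minimal sketch of that check follows; `build_dir_ok` is a hypothetical helper, and the demo runs in a scratch directory with made-up paths instead of touching the real /lib/modules.

```shell
# Succeed when <modules_root>/<release>/build resolves to an existing directory.
# [ -d ] follows symlinks, so a dangling "build" link fails the check.
build_dir_ok() {
    [ -d "$1/$2/build" ]
}

# Demonstration in a scratch directory (all paths below are made up for the demo).
scratch=$(mktemp -d)
release="4.18.0-372.26.1.el8_6.x86_64"
mkdir -p "$scratch/lib/modules/$release"
ln -s "$scratch/usr/src/kernels/$release" "$scratch/lib/modules/$release/build"

if build_dir_ok "$scratch/lib/modules" "$release"; then
    echo "build dir present"
else
    echo "build link is dangling: kernel sources for $release are missing"
fi

# Once the sources directory exists, the same check passes.
mkdir -p "$scratch/usr/src/kernels/$release"
build_dir_ok "$scratch/lib/modules" "$release" && echo "build dir present"

rm -rf "$scratch"
```

Inside the Falco container the equivalent call would be `build_dir_ok /host/lib/modules "$(uname -r)"`. Because the link target is an absolute path, it resolves against the container's own /usr/src/kernels, not the host's.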
So I've had some progress. The nodes have now been updated to match the version on the container (not going to be the long term solution). This gets the module building correctly, but it seems to have issues unloading the falco module before building and insmoding the module when built. Any ideas what would cause this? Afaik falco is not running at this moment in time. It also appears that falco starts after the module build regardless, but falco exporter never reports as ready.
* Running falco-driver-loader for: falco version=0.33.0, driver version=3.0.1+driver, arch=x86_64, kernel release=4.18.0-372.32.1.el8_6.x86_64, kernel version=1
* Running falco-driver-loader with: driver=module, compile=yes, download=yes
================ Cleaning phase ================
* 1. Check if kernel module 'falco' is still loaded:
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
- Kernel module 'falco' is still loaded.
- Trying to unload it with 'rmmod falco'...
- Nothing to do...'falco-driver-loader' will wait until you remove the kernel module to have a clean termination.
- Check that no process is using the kernel module with 'lsmod | grep falco'.
- Sleep 5 seconds...
[WARNING] 'falco' module is still loaded, you could have incompatibility issues.
* 2. Check all versions of kernel module 'falco' in dkms:
- There are some versions of 'falco' module in dkms.
* 3. Removing all the following versions from dkms:
3.0.1+driver
- Removing 3.0.1+driver...
Deleting module falco-3.0.1+driver completely from the DKMS tree.
- OK! Removing '3.0.1+driver' succeeded.
[SUCCESS] Cleaning phase correctly terminated.
================ Cleaning phase ================
* Looking for a falco module locally (kernel 4.18.0-372.32.1.el8_6.x86_64)
* Filename 'falco_rhel_4.18.0-372.32.1.el8_6.x86_64_1.ko' is composed of:
- driver name: falco
- target identifier: rhel
- kernel release: 4.18.0-372.32.1.el8_6.x86_64
- kernel version: 1
* Trying to download a prebuilt falco module from https://download.falco.org/driver/3.0.1%2Bdriver/x86_64/falco_rhel_4.18.0-372.32.1.el8_6.x86_64_1.ko
* Trying to dkms install falco module with GCC /usr/bin/gcc
Sign command: /lib/modules/4.18.0-372.32.1.el8_6.x86_64/build/scripts/sign-file
Signing key: /var/lib/dkms/mok.key
Public certificate (MOK): /var/lib/dkms/mok.pub
Certificate or key are missing, generating self signed certificate for MOK...
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
Creating symlink /var/lib/dkms/falco/3.0.1+driver/source -> /usr/src/falco-3.0.1+driver
Building module:
Cleaning build area...
'/tmp/falco-dkms-make'.....
Signing module /var/lib/dkms/falco/3.0.1+driver/build/falco.ko
Cleaning build area...
falco.ko.xz:
Running module version sanity check.
- Original module
- Found /lib/modules/4.18.0-372.32.1.el8_6.x86_64/extra/falco.ko.xz
- Storing in /var/lib/dkms/falco/original_module/4.18.0-372.32.1.el8_6.x86_64/x86_64/
- Archiving for uninstallation purposes
- Installation
- Installing to /lib/modules/4.18.0-372.32.1.el8_6.x86_64/extra/
Adding any weak-modules
depmod....
* falco module installed in dkms
* falco module found: /var/lib/dkms/falco/3.0.1+driver/4.18.0-372.32.1.el8_6.x86_64/x86_64/module/falco.ko.xz
* Trying insmod
* Unable to insmod falco module
* Trying to load a system falco module, if present
* Success: falco module found and loaded with modprobe
Just managed to exec into the pod early enough and it looks like falco is running early for some reason, so I'll investigate that
Okay, got it building. In our run script, the "$PRETTY_NAME" == "Red Hat" check was no longer evaluating to true because they changed the pretty name. However, we're not in the clear yet: falco_exporter doesn't seem to be able to talk to Falco over gRPC? It's enabled in the config.
[root /]# /usr/bin/falco-exporter
2022/11/10 13:06:57 connecting to gRPC server at unix:///run/falco/falco.sock (timeout 2m0s)
2022/11/10 13:06:57 listening on http://:9376/metrics
2022/11/10 13:08:57 gRPC: error dialing server: context deadline exceeded
grpc:
  enabled: true
  bind_address: "unix:///var/run/falco.sock"
  threadiness: 0
grpc_output:
  enabled: true
Hurray!!! Are you using the latest falco-exporter release? We moved the Falco socket to another location; you will need https://github.com/falcosecurity/falco-exporter/releases/tag/v0.8.0 (see commit https://github.com/falcosecurity/falco-exporter/commit/7aa0a13fcbcf69c4e6ee3bae21c281673e4dee9e)!
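When the exporter times out like this, one thing worth comparing is the socket address the exporter dials against the `bind_address` in the Falco config. The sketch below does that for the two values quoted in the previous comment; `socket_path` is a hypothetical helper and the comparison is purely textual.

```shell
# Strip the unix:// scheme so two socket addresses compare as plain paths.
socket_path() {
    printf '%s\n' "${1#unix://}"
}

# Values copied from the exporter log and the grpc config quoted in this thread.
exporter_target="unix:///run/falco/falco.sock"
falco_bind="unix:///var/run/falco.sock"

if [ "$(socket_path "$exporter_target")" = "$(socket_path "$falco_bind")" ]; then
    echo "exporter and Falco agree on the socket path"
else
    echo "socket paths differ: $(socket_path "$exporter_target") vs $(socket_path "$falco_bind")"
fi
```

This is a string comparison only. On many systems /var/run is a symlink to /run, so on a live host it may be worth resolving both parent directories with `readlink -f` before comparing.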
Oh wow, can't believe I missed that! Yep that's done the trick! Thanks so much for all your help, really appreciated 😄
You are welcome! I am sorry it took so long :(
Aha it's all good, turned out to all be user error in the end anyway!
Describe the bug Falco failing to start
Expected behaviour Falco should come up
Environment
{"machine":"x86_64","nodename":"###","release":"4.18.0-372.26.1.el8_6.x86_64","sysname":"Linux","version":"#1 SMP Sat Aug 27 02:44:20 EDT 2022"}
4.18.0-372.26.1.el8_6.x86_64 #1 SMP Sat Aug 27 02:44:20 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux