falcosecurity / falco

Cloud Native Runtime Security
https://falco.org
Apache License 2.0
7.43k stars 903 forks source link

kmod gets compiled for the wrong kernel version #3164

Open johananl opened 7 months ago

johananl commented 7 months ago

Describe the bug

In https://github.com/falcosecurity/falco/pull/2728 we've added two flags to falco-driver-loader which allow specifying the kernel release and kernel version for which the driver should be compiled.

However, turns out that in order for a kmod to be compiled correctly, we also have to populate the --kernelsourcedir DKMS flag. Failing to do so results in the kmod being installed under the correct path but compiled for the currently-running kernel.

My apologies for the oversight!

How to reproduce it

Install the required packages for a kernel other than the one your Linux instance is currently running as well as the relevant kernel headers:

# Currently-running kernel is 5.15.0-1060-azure
kernel_release="5.15.0-1059-azure"
sudo apt install linux-image-${kernel_release} linux-headers-${kernel_release}

Don't reboot into the new kernel.

Compile the kmod driver for the newly-installed kernel:

kernel_version=$(sudo file /boot/vmlinuz-${kernel_release} | sed 's/.*\(#[^,]\+\).*/\1/')
sudo DRIVER_KERNEL_RELEASE="${kernel_release}" DRIVER_KERNEL_VERSION="${kernel_version}" falco-driver-loader --compile

Check the kernel version in the resulting .ko file:

modinfo "/lib/modules/${kernel_release}/updates/dkms/falco.ko" | grep vermagic

Output:

vermagic:       5.15.0-1060-azure SMP mod_unload modversions

Expected behaviour

The expected output is 5.15.0-1059-azure, however the actual version is the currently-running kernel.

Screenshots

Environment

Thu Apr 11 15:23:37 2024: Falco version: 0.36.0 (x86_64)
Thu Apr 11 15:23:37 2024: Falco initialized with configuration file: /etc/falco/falco.yaml
Falco version: 0.36.0
Libs version:  0.13.1
Plugin API:    3.1.0
Engine:        26
Driver:
  API version:    5.0.0
  Schema version: 2.0.0
  Default driver: 6.0.1+driver
Thu Apr 11 15:24:18 2024: Falco version: 0.36.0 (x86_64)
Thu Apr 11 15:24:18 2024: Falco initialized with configuration file: /etc/falco/falco.yaml
Thu Apr 11 15:24:18 2024: Loading rules from file /etc/falco/falco_rules.yaml
Thu Apr 11 15:24:18 2024: Loading rules from file /etc/falco/falco_rules.local.yaml
{
  "machine": "x86_64",
  "nodename": "falco-test",
  "release": "5.15.0-1060-azure",
  "sysname": "Linux",
  "version": "#69~20.04.1-Ubuntu SMP Tue Mar 19 22:14:45 UTC 2024"
}
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Linux falco-test 5.15.0-1060-azure #69~20.04.1-Ubuntu SMP Tue Mar 19 22:14:45 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Additional context

This problem can be solved by passing --kernelsourcedir /usr/src/linux-headers-${kernel_release} to DKMS. In current Falco (v0.37.1) this can be done by setting the KERNELDIR env var. In Falco v0.36 this is probably unsupported since falco-driver-loader doesn't pass this flag to DKMS.

Is there any value in backporting a fix to v0.36, or should we address this only for Falco v0.37 and above?

FedeDP commented 7 months ago

HI @johananl ! Thanks for opening this issue!

However, turns out that in order for a kmod to be compiled correctly, we also have to populate the --kernelsourcedir DKMS flag. Failing to do so results in the kmod being installed under the correct path but compiled for the currently-running kernel

Oh yep, that makes sense. Please note that next falcoctl version (ie: v0.8.0, there is already an rc1 available) will implement 2 big new features for driver loader:

The latter overlaps a bit with your request since we will then automatically set KERNELDIR (and --kernelsourcedir for dkms; see https://github.com/falcosecurity/falcoctl/pull/476/files#diff-5f5269945d03266c023f8eff0f114f39d68d0fec1bcc9b57f386e309630f9db9R220).

As a side note, falcoctl driver loader is still designed around building drivers for the running kernel; if you instead want a generic solution, you should rely on driverkit itself: https://github.com/falcosecurity/driverkit

johananl commented 7 months ago

Thanks for the quick response @FedeDP!

I don't have a strong opinion on how to specify the kernel version, I was just looking for some deterministic way to get Falco working for an explicit kernel version (which might not be the running version at times). What would be the best way to achieve that?

FedeDP commented 7 months ago

which might not be the running version at times

Yep, then i think it would be better to use driverkit for that, as previously stated. That is the exact purpose of driverkit: we use it in CI to build the drivers you find on download.falco.org. You can then distribute them in your own drivers repo and override FALCOCTL_DRIVER_REPOS env variable while running falcoctl specifying your own repository. See https://falco.org/docs/install-operate/installation/#falco-binary for more info (near the end of the section).

NOTE: we can still support --kerneldir parameter in falcoctl if you wish, i just think it should not be the responsibility of falcoctl to manage those situations. Falcoctl should just be spinned on the node before Falco runs to install the driver.

FedeDP commented 7 months ago

I was thinking: isn't dkms going to use the correct /usr/src/linux-headers-${kernel_release} folder when run against a different kernelrelease/kernelvrsion? I mean, looking at the code from falcoctl: https://github.com/falcosecurity/falcoctl/blob/main/pkg/driver/type/kmod.go#L223

dkmsCmdArgs = fmt.Sprintf(`dkms install --directive="MAKE='/tmp/falco-dkms-make'" -m %q -v %q -k %q --verbose`,
                driverName, driverVersion, kr.String())

that in your example resolves to:

dkms install --directive="MAKE='/tmp/falco-dkms-make'" -m falco -v 6.0.1+driver -k 5.15.0-1059-azure --verbose

It feels weird that dkms then proceed to build against 5.15.0-1060-azure :/ We pass the -k flag right for that...

johananl commented 7 months ago

I'm no expert around DKMS, but what I'm observing in terms of its behavior is that the -k flag affects the path of the resulting .ko file but not the structure of the file, namely what kernel it's getting compiled for. When specifying just -k without --kernelsourcedir, I'm getting a file that's named after the specified kernel release but can't be loaded into that kernel (give "exec format error") and inspecting the .ko file with modinfo confirms that the kernel release is wrong.

FedeDP commented 7 months ago

Mmmh seeing the man page for dkms states:

k <kernel-version>/<arch>
              The kernel and arch to perform the action upon. You can  specify
              multiple  kernel  version/arch  pairs on the command line by re‐
              peating the -k argument with  a  different  kernel  version  and
              arch.  However, not all actions support multiple kernel versions
              (it will error out in this case).  The arch part can be omitted,
              and DKMS will assume you want it to be the arch of the currently
              running system.

and

build [module/module-version] [‐k kernel/arch] [‐‐force]

           Builds  the  specified  module/version combo for the specified ker‐
           nel/arch. If the -k option is not specified it builds for the  cur‐
           rently  running  kernel and arch. 

As far as i know, kernelsourcedir should only matter when you want to use a non-standard KERNELDIR folder. But i agree with you, your output states the opposite :/

--kernelsourcedir <kernel-source-directory-location>
              Using this option you can specify the location  of  your  kernel
              source  directory.  Most likely you will not need to set this if
              your kernel source is accessible  via  /lib/modules/$kernel_ver‐
              sion/build.
johananl commented 7 months ago

FWIW, I've already worked around the problem in my specific use case so I've opened this issue only for the sake of improving the Falco UX, especially given that I was the one who originally asked to add support for explicit kernel releases so I do feel some responsibility here :-)

My bottom line is, if explicit releases are supported, the behavior should be predictable rather than surprising. I'll leave it to you to decide how to proceed, but happy to provide more input/feedback if that helps.

FedeDP commented 7 months ago

My bottom line is, if explicit releases are supported, the behavior should be predictable rather than surprising. I'll leave it to you to decide how to proceed, but happy to provide more input/feedback if that helps.

I fully agree! Still trying to understand why dkms is doing something so weird (and against its own man pages!)

EDIT: i strongly believe we should not be doing anything special for dkms to work; instead, for bpf we might have a small bug since we don't actually use kernelrelease while building the probe: https://github.com/falcosecurity/falcoctl/blob/main/pkg/driver/type/bpf.go#L84

FedeDP commented 7 months ago

Oh wait, you are using falco-driver-loader:0.36.0?

johananl commented 7 months ago

Oh wait, you are using falco-driver-loader:0.36.0?

I was, yes, but I've observed the same behavior with falcoctl driver install. The problem apparently lies in how we invoke DKMS, regardless of the Falco tooling which does the invocation.

FedeDP commented 7 months ago

Found this: https://github.com/ntop/PF_RING/issues/487 and this: https://github.com/ntop/PF_RING/commit/08d3f9d75d1e1f1a832adcc6c0d31142a46a031b. Our dkms.conf looks instead: https://github.com/falcosecurity/libs/blob/master/driver/dkms.conf.in

Perhaps we miss that MAKE[0]="make BUILD_KERNEL=${kernelver}" line? Would you like to try it yourself? It should be as easy as go under /usr/src/falco-6.0.1+driver/ and update dkms.conf file there adding hte new line.

johananl commented 7 months ago

I can't invest more time in this at the moment @FedeDP. As far as I'm concerned we can close the issue since this specific problem is no longer affecting my use case. I opened it mainly to alert you guys to the fact that we may have a bug here, so I'll leave it up to you to decide if it's worth looking into.

Thank you for your help! :pray:

FedeDP commented 7 months ago

Thanks for reporting anyway!

FedeDP commented 6 months ago

/milestone TBD

poiana commented 3 months ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

FedeDP commented 3 months ago

/remove-lifecycle stale

poiana commented 1 week ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale