dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0
1.49k stars 305 forks source link

Linux 6.1 LTS: livepatch module fails to load #1348

Closed snajpa closed 1 year ago

snajpa commented 1 year ago

Hello,

I'm trying to use kpatch-build with upstream live patching to build a live patch, but no matter how simple the patch is (tried multiple variants that worked before with 5.10) and am still getting very similar messages in dmesg:

Kernel is being built with BTF enabled (the only major difference between the .config of 5.10 which worked and 6.1 which now doesn't, besides perhaps a newer GNU toolchain)

[   26.398877] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000001d522ef0, val ffffffffc14531a9

I'm happy to provide more info / invest way more time in making it work, would appreciate further pointers where to look for answers, no idea where to start though :)

Thank you!

.config in question

joe-lawrence commented 1 year ago

Hi @snajpa, did you happen to unload / reload kernel modules (including livepatches) during your attempts? Older kernels pre 0c05e7bd2d01 ("livepatch,x86: Clear relocation targets on a module removal") are particular vulnerable to this limitation.

snajpa commented 1 year ago

thank you for your reply @joe-lawrence

it was on a clean boot in qemu, modules were only loaded in by udev, no unloaded modules that I'm aware of at the time of loading the patch module;

interestingly I've now finished a compile run with a simple patch for 6.1 without BTF and that seems to load & work properly (however I'd still like to patch our running systems with BTF on), I'm now waiting for one more run to finish with the original patch I had in mind to confirm it's really BTF related (changes uname, code line numbers changes in debug prints, etc.)

EDIT: so the full patch loads too when DEBUG_INFO_BTF=n

EDIT2: applied 0c05e7bd2d01 to our 6.1 to test whether this is it

snajpa commented 1 year ago

So after applying 0c05e7bd2d01 and then rebuilding with DEBUG_INFO_BTF=y it's about the same as without the patch:

[   24.010380] module: x86/modules: Invalid relocation target, existing value is nonzero for type 1, loc 00000000c46c84d6, val ffffffffc128e1a9
joe-lawrence commented 1 year ago

(Sorry for belated response, it's a been a busy week) I think CONFIG_DEBUG_INFO_BTF may be a red herring, or at least not the lone cause, as this is enabled on RHEL kernels. If I wanted to try to reproduce with your .config, would that have enough drivers to boot a kvm instance?

snajpa commented 1 year ago

(Sorry for belated response, it's a been a busy week)

Oh, no problem at all, my last few weeks have been crazy too + this is still a FOSS project, best effort is relative :D

If I wanted to try to reproduce with your .config, would that have enough drivers to boot a kvm instance?

Yup it's from a testing QEMU instance

joe-lawrence commented 1 year ago

Unfortunately I can't seem to build a kernel with your .config without incurring a bunch of certificate problems. (This is probably just my RHEL setup.)

Anyway, it might be interesting to build additional debugging for arch/x86/kernel/module.c:

#if 0                                                      // <<< just flip this to 1
#define DEBUGP(fmt, ...)                                \
        printk(KERN_DEBUG fmt, ##__VA_ARGS__)
#else
#define DEBUGP(fmt, ...)                                \
do {                                                    \
        if (0)                                          \
                printk(KERN_DEBUG fmt, ##__VA_ARGS__);  \
} while (0)
#endif

I could then recreate a similar scenario by reloading a module that is a klp-relocation target for a loaded livepatch module (this was the scenario fixed by 0c05e7bd2d01 ("livepatch,x86: Clear relocation targets on a module removal")):

[  224.066026] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 11, loc 0000000060c7e7a7, val ffffffffc09331b0
                                                                                                                                         ^^^^^^^^^^^^^^^^
[  224.081137] livepatch: failed to initialize patch 'livepatch_test' for module 'joydev' (-8)
[  224.090462] livepatch: patch 'livepatch_test' failed for module 'joydev', refusing to load module 'joydev'

Using the "val " value, find the corresponding DEBUGP info:

[  224.066022] Applying relocate section 72 to 12
                                         ^^    ^^
[  224.066025] type 11 st_value ffffffffc09331b0 r_addend 0 loc ffffffffc0db362b
                                                                ^^^^^^^^^^^^^^^^

Using the DEBUGP "section to " values, grep out of readelf sections:

$ readelf --wide --sections livepatch-test.ko | awk '$1=="[Nr]" || $1=="[72]" || $1=="[12]"' | column -t
[Nr]  Name                                   Type      Address           Off     Size    ES  Flg  Lk  Inf  Al
[12]  .text.joydev_connect                   PROGBITS  0000000000000000  000e20  000356  00  AX   0   0    16
[72]  .klp.rela.joydev..text.joydev_connect  RELA      0000000000000000  087fb8  000048  18  AIo  69  12   8

Using the DEBUGP "loc " value, find the closest symbol using the crash utility:

crash> hex
output radix: 16 (hex)
crash> sym ffffffffc0db362b
ffffffffc0db362b (t) joydev_connect+0x29b [livepatch_test]

I could then disassemble joydev_connect and using the relocation offsets in .klp.rela.joydev..text.joydev_connect see what is actually in the place the module loader is complaining about. In my reproducer case, it's expecting to find the standard NULL or 0'd values at the relocation target addresses. The module loader sees that it's not zero and emits the complaint I listed above.

Your case might be different, so I stopped here in case disassembly isn't relevent and maybe just figuring out exactly which relocations are halting your livepatch module load is good enough.

It would also be interesting to inspect the relocations in one of problematic kpatch.ko files that you generated. Could you attach one here? Thanks.

snajpa commented 1 year ago

a fresh build without the 0c05e7bd2d01 patch: dmesg + readelf + livepatch_1.ko - though that module doesn't seem to have debug symbols in - how do I change that? EDIT: that might be due to automagic in nixpkgs, retrying now with dontStrip = true;

I also tried building the crash utility but the build process seems complicated to be easily ported to NixOS (relies on being able to download stuff - like gdb, would have to work around that first)

snajpa commented 1 year ago

A new run with donStrip = true in the livepatch build derivation, so with symbols now: output paste + patch module

snajpa commented 1 year ago

link to the patch I'm trying to build - but even more trivial ones don't work

EDIT: vpsadminos.c holds some unmergeable ugliness

otherwise our changes vs. vanilla are around improving/enabling OS-level containers much in OpenVZ-like fashion, so there's syslog namespace and a few hacks so lxcfs isn't needed that much (slowly working to hack it all into kernel directly) + few first-level userns special treatment cases, etc., but we don't touch any live-patching/module loading/unloading code

joe-lawrence commented 1 year ago

Thanks for the updates @snajpa, I'll have to take some time looking through these :) Perhaps a stupid question, but do any out-of-tree modules load successfully under this kernel? From your dmesg + readelf log:

[ 1307.376434] Applying relocate section 105 to 104
[ 1307.376435] type 1 st_value ffffffffc0cd1000 r_addend 0 loc ffffffffc16ee278
[ 1307.376436] type 1 st_value ffffffffc16ea1a9 r_addend 0 loc ffffffffc16ee490
[ 1307.376437] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 00000000feb2c90a, val ffffffffc16ea1a9

[nix-shell:~]# readelf --wide --sections /nix/store/xiamzl1c35igcp73n6cn6kgj60xxs1qr-livepatch_1-6.1.37/lib/modules/6.1.37/extra/livepatch_1.ko  | awk '$1=="[Nr]" || $1=="[105]" || $1=="[104]"' | column -t
[Nr]   Name                            Type      Address           Off     Size    ES  Flg  Lk   Inf  Al
[104]  .gnu.linkonce.this_module       PROGBITS  0000000000000000  002080  000380  00  WA   0    0    64
[105]  .rela.gnu.linkonce.this_module  RELA      0000000000000000  004c30  000030  18  I    114  104  8

it seems very odd that the kernel doesn't like the .rela.gnu.linkonce.this_module relocation ... that seems pretty common to any kernel module and not necessarily livepatch specific.

snajpa commented 1 year ago

do any out-of-tree modules load successfully under this kernel?

yup, all of the OpenZFS modules load correctly

snajpa commented 1 year ago

though it has to be said that I'm providing the OpenZFS build all the generated kernel outputs that usually live in /lib/modules/$kernel_version/{build,source} and for kpatch-build I'm using a freshly unpacked source, only reusing the .config from the running kernel - maybe the difference could arise when BTF is enabled that I'm not providing the build dir to the module build process? Though I'm not clear on what could it be from there that would be unique to each build...

snajpa commented 1 year ago

btw this is the code which builds the live-patch module; it first copies kpatch-build locally so that it can write to paths it is used writing to (/nix/store where the package lives normally is immutable), then it unpacks the kernel tarball, figures out what is the unpacked dirname into the $sourceRoot variable and then it copies the .config from the running kernel in there, before calling kpatch-build at the end of this buildPhase section

joe-lawrence commented 1 year ago

This might be a long shot, but I find that the .gnu.linkonce.this_module section is added by scripts/mod/modpost.c::add_header(). In the case of my kpatch, if I build examples/cmdline-string.patch with --debug, it will leave the intermediate .mod.c file at ~/.kpatch/tmp/patch/livepatch-cmdline-string.mod.c and it includes:

__visible struct module __this_module
__section(".gnu.linkonce.this_module") = {
        .name = KBUILD_MODNAME,
        .init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
        .exit = cleanup_module,
#endif  
        .arch = MODULE_ARCH_INIT,
};

and here we see why there are relocations generated for the init_module and cleanup_module symbols for this section.

From your livepatch_1.ko:

$ readelf --wide --relocs livepatch_1.ko | awk '/.rela.gnu.linkonce.this_module/' RS="\n\n"
Relocation section '.rela.gnu.linkonce.this_module' at offset 0x4c30 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000138  0000003a00000001 R_X86_64_64            0000000000000000 init_module + 0
0000000000000350  0000003500000001 R_X86_64_64            0000000000000000 cleanup_module + 0
             ^^^

I don't know how you might create, build and load such a mismatch, but maybe it's possible that the offset into the struct module for the .exit member is 0x350 for livepatch_1.ko, but something else for the running kernel? And that the module loader has already saved something into struct module offset 0x350 rather than leaving for the relocation to fill.

I think we could confirm this by running a similar readelf command on one of the module .kos that does successfully load, like:

# Where is .exit offset for a loadable kernel driver?
$ readelf --wide --relocs drivers/input/joydev.ko | awk '/.rela.gnu.linkonce.this_module/' RS="\n\n"
Relocation section '.rela.gnu.linkonce.this_module' at offset 0x39ae8 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000138  0000007200000001 R_X86_64_64            0000000000000000 init_module + 0
0000000000000360  0000006900000001 R_X86_64_64            0000000000000000 cleanup_module + 0
             ^^^

# Where is .exit offset for a kpatch.ko?
$ readelf --wide --relocs ~/kpatch/livepatch-cmdline-string.ko | awk '/.rela.gnu.linkonce.this_module/' RS="\n\n"
Relocation section '.rela.gnu.linkonce.this_module' at offset 0x2240 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000138  0000005700000001 R_X86_64_64            0000000000000000 init_module + 0
0000000000000360  0000005200000001 R_X86_64_64            0000000000000000 cleanup_module + 0
             ^^^

Hopefully those offsets are the same and that the kernel has a properly sized struct module to work with.

BTW I notice that CONFIG_MODVERSIONS is not set in your config. That could be an additional protection against this potential cause.

snajpa commented 1 year ago

hmm, joydev.ko seems to have cleanup_module @ 0x360:

[nix-shell:~]# readelf --wide --relocs joydev.ko | awk '/.rela.gnu.linkonce.this_module/' RS="\n\n"
Relocation section '.rela.gnu.linkonce.this_module' at offset 0x7fe0 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000138  0000004d00000001 R_X86_64_64            0000000000000000 init_module + 0
0000000000000360  0000004400000001 R_X86_64_64            0000000000000000 cleanup_module + 0

[nix-shell:~]# readelf --wide --relocs /nix/store/y9xpiyq88bz9z8krwixi5wgdiqy3q8wi-livepatch_1-6.1.37/lib/modules/6.1.37/extra/livepatch_1.ko | awk '/.rela.gnu.linkonce.this_module/' RS="\n\n"
Relocation section '.rela.gnu.linkonce.this_module' at offset 0x3e40 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000138  0000009400000001 R_X86_64_64            0000000000000000 init_module + 0
0000000000000350  0000008f00000001 R_X86_64_64            0000000000000000 cleanup_module + 0
joe-lawrence commented 1 year ago

OK, I think I reproduced it here:

  1. Build and reboot a kernel with CONFIG_MODVERSIONS is not set and DEBUG_INFO_BTF=y
  2. Reconfigure the .config to turn off DEBUG_INFO_BTF
  3. Build a kpatch with the revised .config
  4. Loading this kpatch fails with:
    
    [ 1171.914955] Applying relocate section 42 to 41
    [ 1171.914956] type 1 st_value ffffffffc089b000 r_addend 0 loc ffffffffc087b1b8
    [ 1171.914957] type 1 st_value ffffffffc0879369 r_addend 0 loc ffffffffc087b3d0
    [ 1171.914957] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 00000000deba4b51, val ffffffffc0879369

$ readelf --wide --sections livepatch-cmdline-string.ko | awk '$1=="[Nr]" || $1=="[42]" || $1=="[41]"' | column -t [Nr] Name Type Address Off Size ES Flg Lk Inf Al [41] .gnu.linkonce.this_module PROGBITS 0000000000000000 001bc0 000380 00 WA 0 0 64 [42] .rela.gnu.linkonce.this_module RELA 0000000000000000 001f40 000030 18 I 61 41 8


So since your joydev.ko and livepatch_1.ko are trying to relocate the struct module exit member at different offsets, it would appear to me that there are two different kernel configurations in play that are changing the definition of the structure.
snajpa commented 1 year ago

That is not the case though, if I change anything in the kernel config, everything depending on that config rebuilds too; all the dependencies - such as OpenZFS and live-patches rebuild too since they depend on the output of the config "derivation" (in Nix terms) and that changes

So I alternate between whole worlds built with either DEBUG_INFO_BTF= yes or no (EDIT: yes or not set at all, leaving it to defaults), but it's not possible to mix both in the same build.

joe-lawrence commented 1 year ago

Apologies if you've already answered, but I'm completely out of my depth in the world of Nix..

... for kpatch-build I'm using a freshly unpacked source, only reusing the .config from the running kernel

How does this .config get fed to kpatch-build and how do we know it actually matches what the running kernel was built with?

These members will be conditionally compiled into the struct module layout depending on DEBUG_INFO_BTF:

struct module {
        ...
        int (*init)(void);
        ...
#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
        unsigned int btf_data_size;
        void *btf_data;
#endif
        ...
        void (*exit)(void);

and if I repeat the repro steps from my last comment but with CONFIG_MODVERSIONS=y, the kpatch will refuse to load with: livepatch_cmdline_string: disagrees about version of symbol module_layout. Is that something you could try with the running kernel and kpatch .config?

So I alternate between whole worlds built with either DEBUG_INFO_BTF= yes or no (EDIT: yes or not set at all, leaving it to defaults), but it's not possible to mix both in the same build.

What does "whole worlds" refer to? Running kernel build and kpatch build?

Thanks for tagging along, this has been kinda interesting to debug so far :)

snajpa commented 1 year ago

if you'd like to be able to reproduce directly, here are the steps:

  1. Install Nix
  2. clone https://github.com/vpsfreecz/vpsadminos branch snajpa-kpatch-1348
  3. chdir into vpsadminos
  4. clone nixpkgs here https://github.com/nixos/nixpkgs @ commit e7603eba51f2c7820c0a182c6bbb351181caa8e7 (as nixpkgs)
  5. now you can run
    export NIX_PATH=`pwd`
    make qemu

    ...and eventually after everything compiles, you should get qemu which will eventually try to load the livepatch and fails; but it might need root for network setup (we run it in nested qemu)

What does "whole worlds" refer to? Running kernel build and kpatch build?

yes, but more generally, a tiniest change to either vpsadminos or nixpkgs generates new squashfs rootfs for a live system (it can be installed too but that's not the majority use-case)

EDIT: useful for debugging

# to get shell where gdb is available
nix-shell -p gdb
# ^ these can be stacked on top of each other with further calls such as
nix-shell -p elfutils

# attempt to load the livepatch (also to quickly find out where the module is stored)
live-patches load
snajpa commented 1 year ago

if I repeat the repro steps from my last comment but with CONFIG_MODVERSIONS=y, the kpatch will refuse to load with: livepatch_cmdline_string: disagrees about version of symbol module_layout. Is that something you could try with the running kernel and kpatch .config?

yes I now get the same, but I'm pretty sure the config used to build both the kernel and the livepatch is the same, but Module.symvers file isn't shared between the running kernel build and the livepatch build. Maybe that is the problem?

joe-lawrence commented 1 year ago

yes I now get the same, but I'm pretty sure the config used to build both the kernel and the livepatch is the same, but Module.symvers file isn't shared between the running kernel build and the livepatch build. Maybe that is the problem?

Hmm, instead of building a kpatch, could you build a simple out of tree kernel module using the same mechanism? That would give insight on whether it's a kpatch-build specific problem or the mechanism in general.

As for Module.symvers, the create-diff-object (sub)program reads the one generated by the kpatch-build reference kernel build, not the running kernel's. CONFIG_MODVERSIONS will protect the running kernel from loading a module with different symbol versions. AFAIK it's more of a build artifact than a build input, but I could be wrong.

if you'd like to be able to reproduce directly, here are the steps ...

Thanks for those, I may have to defer attempting that to another day :) I assume Nix can install on top of RHEL or Fedora without needing NixOS?

snajpa commented 1 year ago

Oh, I might actually not be using the same sources, I might be missing the randstruct seed patch, which seems to have influence on the vermagic. This definitely won't be about broken kpatch, but my implementation of it. Sorry for bothering you; I wouldn't know where to look if it wasn't for your advice & guidance. Thank you.

snajpa commented 1 year ago

I assume Nix can install on top of RHEL or Fedora without needing NixOS?

yup :)

if I get the vermagic stuff into order and it still won't fly, then I'll think it might be a bug with something outside of my reach, but I really haven't noticed this until now and this should really be it - in NixOS, they have the CONFIG_MODVERSIONS to y only for kernels older than 4.9, don't know the reason behind that

joe-lawrence commented 1 year ago

Oh, I might actually not be using the same sources, I might be missing the randstruct seed patch, which seems to have influence on the vermagic. This definitely won't be about broken kpatch, but my implementation of it. Sorry for bothering you; I wouldn't know where to look if it wasn't for your advice & guidance. Thank you.

No problem, it is often that these things don't manifest themselves in the easiest to debug ways. As it turns out, having so many config-dependent structure members between the init / exit callbacks makes the kernel's relocation sanity check a decent indicator of config drift. I'll let you close out the issue if this turns out to be the case, otherwise let us know if this one takes another interesting turn.

snajpa commented 1 year ago

LOL! what a stupid mistake... turns out, the environment for building the livepatch was missing pahole, but the environment generating the .config didn't have it either and so on the actual vmlinux + modules compile run, it modified the config according to the following diff:

--- /nix/store/b0hhxfcvj586brarvpsq4q2qhads3j93-linux-config-6.1.37     1970-01-01 00:00:01.000000000 +0000
+++ /build/linux-9e521fe2d9caf447654107798b253a384508c96e/build/.config 2023-07-20 15:17:51.583854016 +0000
@@ -16,7 +16,7 @@
 CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
 CONFIG_CC_HAS_ASM_INLINE=y
 CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
-CONFIG_PAHOLE_VERSION=0
+CONFIG_PAHOLE_VERSION=125
 CONFIG_IRQ_WORK=y
 CONFIG_BUILDTIME_TABLE_SORT=y
 CONFIG_THREAD_INFO_IN_TASK=y
@@ -8682,6 +8682,9 @@
 # CONFIG_DEBUG_INFO_COMPRESSED is not set
 # CONFIG_DEBUG_INFO_SPLIT is not set
 CONFIG_DEBUG_INFO_BTF=y
+CONFIG_PAHOLE_HAS_SPLIT_BTF=y
+CONFIG_DEBUG_INFO_BTF_MODULES=y
+# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set
 # CONFIG_GDB_SCRIPTS is not set
 CONFIG_FRAME_WARN=2048
 # CONFIG_STRIP_ASM_SYMS is not set

... then the OpenZFS build is using the build/ and source/ outputs from the kernel compile process, which contains the modifications done to the original .config by the compile process when pahole is present in the environment; the fix is simple, ensuring that all the build environments share the same nativeBuildInputs values, don't know why I didn't refer them directly and opted to copy them over instead - wouldn't have to keep them in sync manually...

Thank you so very much for helping me to get to the bottom of this.

joe-lawrence commented 1 year ago

Oh that's sneaky. One thing you may consider amending is the kpatch source tree dependencies target. It tries to detect the running distro and then installs all the necessary packages for building the kernel and kpatch binaries. Copying the configuration as you're doing now is probably the safest bet, but I thought I'd mention it in case it simplifies any of your use cases.