dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0

kpatch module build could fail when kernel 5.19+ contains dynamic symbols #1284

Open sumanthkorikkar opened 2 years ago

sumanthkorikkar commented 2 years ago

Hi All,

When building a kpatch module for a 5.19+ kernel with -ffunction-sections, the vmlinux build can fail during the link stage.

Reason: the s390 kernel is built with -fPIE, and for kpatch purposes it is also built with ARCH_KCFLAGS "-ffunction-sections -fdata-sections".

Output:

    ld: .tmp_vmlinux.btf: too many sections: 65614 (>= 65280)
    ld: final link failed: nonrepresentable section on output
    BTF .btf.vmlinux.bin.o
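For context on the number in that message: 65280 is 0xff00, the ELF constant SHN_LORESERVE, where the reserved section-index range begins. The following is a minimal Python sketch, not part of kpatch or binutils, that reads the section count from an ELF64 image, including the extended-numbering case where e_shnum is 0 and the real count is stored in sh_size of section header 0. The function names are hypothetical and only ELF64 is handled.

```python
import struct

# First reserved ELF section index (0xff00 == 65280), per the ELF gABI.
SHN_LORESERVE = 0xFF00


def section_count(data: bytes) -> int:
    """Return the number of section headers in an ELF64 image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2:  # EI_CLASS: 2 means ELFCLASS64
        raise ValueError("only ELF64 is handled in this sketch")
    # EI_DATA: 1 is little-endian, 2 is big-endian (s390x is big-endian).
    endian = "<" if data[5] == 1 else ">"
    fields = struct.unpack_from(endian + "16sHHIQQQIHHHHHH", data, 0)
    e_shoff, e_shnum = fields[6], fields[12]
    if e_shnum == 0 and e_shoff != 0:
        # Extended numbering: the real count is sh_size of section header 0.
        sh0 = struct.unpack_from(endian + "IIQQQQIIQQ", data, e_shoff)
        return sh0[5]
    return e_shnum


def needs_extended_numbering(count: int) -> bool:
    """True when the count no longer fits below the reserved index range."""
    return count >= SHN_LORESERVE
```

On a real build this could be called as section_count(open(path, "rb").read()) on an object file, where path is whatever file is being inspected. A count at or above 65280 is not itself an error for ELF; the failure reported here arises only in combination with dynamic symbols, as described in the list below.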

In this scenario:

  1. The gABI doesn't support dynamic symbols in output sections beyond 64k. Ref: binutils: check_dynsym (bfd *abfd, Elf_Internal_Sym *sym) https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=bfd/elflink.c;h=2b1450fa4e146936ba4fd6d02691a863f26a88b6;hb=HEAD#l10183

  2. s390 kernel: readelf --dyn-syms vmlinux | wc -l reports 1556

  3. The x86 kernel doesn't seem to have dynamic symbols and hence does not hit this problem: readelf --dyn-syms vmlinux | wc -l reports 0
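One caveat on the counts above: a line count of readelf output also includes the table title and column header, so it slightly overstates the number of symbols. A minimal Python sketch that counts only the numbered symbol rows follows. The helper name is hypothetical, and the row pattern assumes the usual readelf layout in which each entry starts with an index followed by a colon.

```python
import re

# A symbol row starts with optional spaces, an index, and a colon, e.g. "     1: 0000...".
_SYMBOL_ROW = re.compile(r"^\s*\d+:\s")


def count_dynsym_rows(readelf_output: str) -> int:
    """Count symbol rows in the text printed by `readelf --dyn-syms`."""
    return sum(
        1 for line in readelf_output.splitlines() if _SYMBOL_ROW.match(line)
    )
```

The input text could be obtained with subprocess.run(["readelf", "--dyn-syms", "vmlinux"], capture_output=True, text=True).stdout. The count includes the null symbol at index 0, and empty output (no .dynsym table) yields 0.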

Possible fix:

  1. Provide the explicit TARGETS eg: TARGETS="fs/proc/" KPATCHBUILD_OPTS="-v $vmlinux -s $linux_src -d" ./kpatch-test rhel-9.0/data-new.patch

  2. Change linker script like:

    diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
    index 2e526f11b91e..1d3d2d878acb 100644
    --- a/arch/s390/kernel/vmlinux.lds.S
    +++ b/arch/s390/kernel/vmlinux.lds.S
    @@ -48,7 +48,7 @@ SECTIONS
                IRQENTRY_TEXT
                SOFTIRQENTRY_TEXT
                FTRACE_HOTPATCH_TRAMPOLINES_TEXT
    -               *(.text.*_indirect_*)
    +               *(.text.*)
                *(.gnu.warning)
                . = ALIGN(PAGE_SIZE);
                _etext = .;             /* End of text section */

  3. Create a custom target in the kernel's top-level Makefile. This target would build only kernel objects without linking the vmlinux target.

Question: Could you please suggest how this could be handled better in kpatch?

Thank you

Best Regards Sumanth

joe-lawrence commented 2 years ago

Hi @sumanthkorikkar, thanks for the detailed report.

Do you happen to know why the s390 arch uses dynamic symbols while x86 does not?

Also, I thought (at one time) that kernel LTO efforts leveraged -ffunction-sections as well. I wonder if that project would eventually hit this limitation as well, assuming they are working with an arch that uses dynamic symbols. Perhaps they would be fellow travelers in this space and interested in supporting dynamic symbols in output sections beyond 64k.

As for possible workarounds, a few questions and ideas on proposed solutions:

  1. Provide the explicit TARGETS eg: TARGETS="fs/proc/" KPATCHBUILD_OPTS="-v $vmlinux -s $linux_src -d" ./kpatch-test rhel-9.0/data-new.patch

This doesn't seem ideal as the user may not know exactly which target directories need to be rebuilt (ie, kpatch-build is doing that work for us).

  2. Change linker script [ ... filter *(.text.*_indirect_*) sections ... ]

I assume this would help as we recently added the external expoline requirement for kpatch? And then does it buy us only a few fewer dynamic symbols?

  3. Create custom target in kernel top Makefile. This target would build only kernel objects without linking vmlinux target.

Well, we already slightly modify link-vmlinux.sh and Makefile.modfinal so this idea is not without precedent.

Maybe by modifying a recent top level Makefile like (untested):

    diff --git a/Makefile b/Makefile
    index 00fd80c5dd6e..6a9afdc3ee73 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1844,6 +1844,9 @@ $(build-dirs): prepare
            single-build=$(if $(filter-out $@/, $(filter $@/%, $(KBUILD_SINGLE_TARGETS))),1) \
            need-builtin=1 need-modorder=1

    +.PHONY: kpatch
    +kpatch: $(build-dirs)
    +
     clean-dirs := $(addprefix _clean_, $(clean-dirs))
     PHONY += $(clean-dirs) clean
     $(clean-dirs):

though adding anything as specific as that runs into code drift maintenance. (I can already see that build-dirs is relatively new and missing from older kernels.) Alternatively, I think the same is achievable by building the kernel with make */

In any case, we'd lose the ability to specify the targets on the kpatch-build command line.

jpoimboe commented 2 years ago

Do you happen to know why the s390 arch uses dynamic symbols while x86 does not?

I have the same question. There will probably be other features in the future which rely on -ffunction-sections, so if there's some way for the s390 kernel to avoid using dynamic symbols then that might be the best way to "fix" the issue.

sumanthkorikkar commented 2 years ago

Hi Joe, Josh,

Do you happen to know why the s390 arch uses dynamic symbols while x86 does not?

Discussed this with the compiler team.

x86 kernel:

s390 kernel:

  2. Change linker script [ ... filter *(.text.*_indirect_*) sections ... ]

I assume this would help as we recently added the external expoline requirement for kpatch? And then does it buy us only a few fewer dynamic symbols?

With -ffunction-sections, each function gets its own .text section. However, as I understand it, the vmlinux created during the kpatch build process does not matter. Individual object files would still have a separate text section for each function, and kpatch-build deals only with those. Hence, combining all the .text sections during the link stage eliminates the "ld: .tmp_vmlinux.btf: too many sections: 65614 (>= 65280)" error altogether. This could be one possible approach (a quick fix).
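To illustrate why widening the input pattern from *(.text.*_indirect_*) to *(.text.*) reduces the section count, here is a rough Python model. It assumes that linker script patterns behave like shell globs and that an input section matched by no pattern is kept as an orphan output section under its own name. The function and the section names are hypothetical examples, and this is a simplification of what ld actually does.

```python
from fnmatch import fnmatchcase


def output_sections(input_sections, text_patterns):
    """Model which output sections result from a list of input section names.

    Inputs matching any pattern are merged into ".text"; anything else is
    treated as an orphan that keeps its own name in the output.
    """
    out = set()
    for name in input_sections:
        if any(fnmatchcase(name, pattern) for pattern in text_patterns):
            out.add(".text")
        else:
            out.add(name)  # orphan: one output section per distinct name
    return out


# Hypothetical per-function sections as produced by -ffunction-sections.
inputs = [
    ".text",
    ".text.vfs_read",
    ".text.vfs_write",
    ".text.__s390_indirect_jump_r1",
]

before = output_sections(inputs, [".text", ".text.*_indirect_*"])
after = output_sections(inputs, [".text", ".text.*"])
```

In this model the narrow pattern leaves every ordinary per-function section as its own output section, while the wide pattern folds all of them into a single .text. The per-function sections in the individual object files are unaffected either way.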

Let me know your thoughts.

  3. Create custom target in kernel top Makefile. This target would build only kernel objects without linking vmlinux target.

Well, we already slightly modify link-vmlinux.sh and Makefile.modfinal so this idea is not without precedent.

Maybe by modifying a recent top level Makefile like (untested):

    diff --git a/Makefile b/Makefile
    index 00fd80c5dd6e..6a9afdc3ee73 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1844,6 +1844,9 @@ $(build-dirs): prepare
            single-build=$(if $(filter-out $@/, $(filter $@/%, $(KBUILD_SINGLE_TARGETS))),1) \
            need-builtin=1 need-modorder=1

    +.PHONY: kpatch
    +kpatch: $(build-dirs)
    +
     clean-dirs := $(addprefix _clean_, $(clean-dirs))
     PHONY += $(clean-dirs) clean
     $(clean-dirs):

though adding anything as specific as that runs into code drift maintenance. (I can already see that build-dirs is relatively new and missing from older kernels.) Alternatively, I think the same is achievable by building the kernel with make */

In any case, we'd lose the ability to specify the targets on the kpatch-build command line.

I tried this patch and it works in the normal scenario. However, module.patch failed because it couldn't identify nfsd/export.o (a module) and only identified kpatch_string (af_netlink.o) as a new function. I will check further.

Thanks

jpoimboe commented 2 years ago

With -ffunction-sections, each function gets its own .text section. However, as I understand it, the vmlinux created during the kpatch build process does not matter. Individual object files would still have a separate text section for each function, and kpatch-build deals only with those. Hence, combining all the .text sections during the link stage eliminates the "ld: .tmp_vmlinux.btf: too many sections: 65614 (>= 65280)" error altogether. This could be one possible approach (a quick fix).

Let me know your thoughts.

This may not be a good long term solution. The kernel is moving towards enabling LTO, in which case kpatch-build will have to analyze vmlinux.o rather than individual translation units.

x86 also has recently added IBT, for which kpatch-build might also need to analyze vmlinux.o (not sure about this one yet).

Also, there are other features which use -ffunction-sections (fgkaslr, as one example).

So the s390 kernel needs to figure out a way to support >64k sections.

jpoimboe commented 2 years ago

Would it be possible for s390 to use --emit-relocs?

sumanthkorikkar commented 2 years ago

Hi Josh, Joe

Thank you for the inputs.

Agreed. We would definitely like to have --emit-relocs or similar support for the s390 kernel in the long term, but this might take a while given the complexities.

As a short-term solution for s390 kpatch, it would therefore be necessary either to provide explicit TARGETS or to make this change in the linker script.

jpoimboe commented 2 years ago

@sumanthkorikkar

After looking at how x86 does it, converting s390 to --emit-relocs actually seems pretty straightforward. I made the following patch; it booted successfully with CONFIG_RANDOMIZE_BASE. I'll try to give it some more testing and post it upstream.

s390-reloc.patch.txt

jpoimboe commented 2 years ago

Hm, I just spotted an obvious bug in handle_relocs(), not sure how it's booting ;-)

EDIT: oops, accidentally tested the wrong kernel! Anyway the patch is rough, but you get the idea.

jpoimboe commented 2 years ago

Here's a working version of the patch. I haven't tested it with 64k+ symbols and kpatch yet.

s390-reloc.patch.txt

sumanthkorikkar commented 2 years ago

Hi Josh, Thank you for the patch

Few things:

I have yet to understand whether other rela types (other than R_390_64) need an offset adjustment.

Also, I will be on vacation for next 4 weeks.

jpoimboe commented 2 years ago

  • Option -mno-pic-data-is-text-relative would generate R_390_GOTENT. This should be handled in do_reloc().

But that option is only used for the livepatch, for which do_reloc() doesn't run. Instead the module relocation code runs (apply_relocate_add() in arch/s390/kernel/module.c). So I don't see the need for R_390_GOTENT in do_reloc().

  • Greater than 64k output sections works, as no dynamic symbols are present.

  • ARCH_KCFLAGS+="-fPIC" should be added to the s390 kpatch tools, as -mno-pic-data-is-text-relative can be used only with -fPIC. kpatch seems to work with these.

Yes, I discovered that as well. In kpatch-build, ARCH_KCFLAGS needs -fPIC added (along with the existing -mno-pic-data-is-text-relative) to force the use of R_390_GOTENT for text accesses to global data.

sumanthkorikkar commented 1 year ago

Hi @jpoimboe

I rebased your changes and tried testing it on v6.2. It looks promising to me. Could you please send these changes across to the s390 mailing list for maintainers review.

Thanks a lot.

joe-lawrence commented 1 year ago

Hi @jpoimboe, @sumanthkorikkar, we just hit this while rebasing the integration tests to v6.3. Shall we retry with the patch from Josh's Aug 25 comment, or have any alternative solutions been explored on the s390 mailing list? Thanks.

sumanthkorikkar commented 1 year ago

Hi Joe, Josh,

I tried Josh Poimboeuf's patch series on the latest branch and added a minor fixup on top of it. It is currently under internal review. I will send the rebased series to you both soon for your feedback. Thank you Josh, Joe.

joe-lawrence commented 1 year ago

Hi @sumanthkorikkar, if you have a WIP rebased version of the patch for 6.4, would you mind attaching it here? We can throw it into our internal tests, at least to give it some runtime and maybe find subsequent kpatch-build issues for s390x. Thanks.

sumanthkorikkar commented 1 year ago

Hi Joe, Josh,

Attached is the rebased Josh Poimboeuf patch series (rebased onto master) with a fixup. It was rebased from the following source: https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=s390 It seems to work with gcc. clang raises a few concerns, which are under discussion.

Let me know, if this patch works for you. Josh-Poimboeuf_series_emit_relocs_rebase_fixup.patch.txt

Thank you Joe & Josh

github-actions[bot] commented 1 year ago

This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added.

sumanthkorikkar commented 1 year ago

In progress.

jpoimboe commented 1 year ago

@sumanthkorikkar sorry for not being communicative on this issue, it has been a busy time. I will also be out for another two weeks, feel free to keep pinging me after that :-)

sumanthkorikkar commented 1 year ago

ok, Thanks Josh. Will do so.