dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0

find_local_syms for xxx.c: found_none #706

Closed vincentbernat closed 5 years ago

vincentbernat commented 7 years ago

Hey!

I am trying to patch a 3.13 kernel and I am running into some difficulties. I get a lot of errors during symbol comparison:

/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: dev_ioctl.o: find_local_syms: 136: find_local_syms for dev_ioctl.c: found_none
/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: route.o: find_local_syms: 136: find_local_syms for route.c: found_none
/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: ip6_fib.o: find_local_syms: 136: find_local_syms for ip6_fib.c: found_none
/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: flow.o: find_local_syms: 136: find_local_syms for flow.c: found_none
/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: ipv6_sockglue.o: find_local_syms: 136: find_local_syms for ipv6_sockglue.c: found_none
/home/ubuntu/kpatch/kpatch-build/create-diff-object: ERROR: ndisc.o: find_local_syms: 136: find_local_syms for ndisc.c: found_none

If I use readelf -s /usr/lib/debug/boot/vmlinux-3.13.0-117-generic and look for flow.c, I see (| grep -E 'FILE|FUNC|OBJECT' | grep LOCAL):

 56947: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS flow.c
 56948: ffffffff816725b0   191 FUNC    LOCAL  DEFAULT    1 flow_cache_gc_task
 56949: ffffffff81fd85c0     4 OBJECT  LOCAL  DEFAULT   30 flow_cache_gc_lock
 56950: ffffffff81cdb5e0    16 OBJECT  LOCAL  DEFAULT   16 flow_cache_gc_list
 56951: ffffffff81d19910     8 OBJECT  LOCAL  DEFAULT   16 flow_cachep
 56952: ffffffff81672670   113 FUNC    LOCAL  DEFAULT    1 flow_cache_new_hashrnd
 56953: ffffffff816726f0    48 FUNC    LOCAL  DEFAULT    1 flow_cache_flush_per_cpu
 56954: ffffffff81672720   181 FUNC    LOCAL  DEFAULT    1 flow_cache_cpu_prepare.is
 56955: ffffffff81672840   314 FUNC    LOCAL  DEFAULT    1 flow_cache_flush_tasklet
 56956: ffffffff816727e0    95 FUNC    LOCAL  DEFAULT    1 flow_cache_queue_garbage.
 56957: ffffffff81672980   329 FUNC    LOCAL  DEFAULT    1 __flow_cache_shrink.isra.
 56958: ffffffff81672ad0   118 FUNC    LOCAL  DEFAULT    1 flow_cache_cpu
 56959: ffffffff81672b50   245 FUNC    LOCAL  DEFAULT    1 flow_hash_code.isra.4.con
 56960: ffffffff81fd8540   128 OBJECT  LOCAL  DEFAULT   30 flow_cache_global
 56961: ffffffff8189e7a0    88 OBJECT  LOCAL  DEFAULT    4 CSWTCH.63
 56962: ffffffff81d8dcc9   376 FUNC    LOCAL  DEFAULT   19 flow_cache_init.constprop
 56963: ffffffff81d8de41    45 FUNC    LOCAL  DEFAULT   19 flow_cache_init_global
 56964: ffffffff81fd85c4     4 OBJECT  LOCAL  DEFAULT   30 flow_flush_lock.27780
 56965: ffffffff816730c0    16 FUNC    LOCAL  DEFAULT    1 flow_cache_flush_task
 56966: ffffffff81cdb600    32 OBJECT  LOCAL  DEFAULT   16 flow_cache_flush_work
 56967: ffffffff81e597a0     8 OBJECT  LOCAL  DEFAULT   20 __initcall_flow_cache_ini
 56968: ffffffff81b69468    18 OBJECT  LOCAL  DEFAULT   12 __kstrtab_flow_cache_look
 56969: ffffffff81b3da70     8 OBJECT  LOCAL  DEFAULT   10 __kcrctab_flow_cache_look
 56970: ffffffff81b6947a    17 OBJECT  LOCAL  DEFAULT   12 __kstrtab_flow_cache_geni
 56971: ffffffff81b3da68     8 OBJECT  LOCAL  DEFAULT   10 __kcrctab_flow_cache_geni

If I do that on patched flow.o, I get:

    43: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS flow.c
    44: 0000000000000000   113 FUNC    LOCAL  DEFAULT    2 flow_cache_new_hashrnd
    45: 0000000000000000    48 FUNC    LOCAL  DEFAULT    4 flow_cache_flush_per_cpu
    46: 0000000000000000   181 FUNC    LOCAL  DEFAULT    6 flow_cache_cpu_prepare.is
    47: 0000000000000000   314 FUNC    LOCAL  DEFAULT   10 flow_cache_flush_tasklet
    48: 0000000000000000   263 FUNC    LOCAL  DEFAULT    8 flow_cache_queue_garbage.
    49: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   51 flow_cache_gc_lock
    50: 0000000000000000    16 OBJECT  LOCAL  DEFAULT   44 flow_cache_gc_list
    51: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   43 flow_cachep
    52: 0000000000000000   329 FUNC    LOCAL  DEFAULT   12 __flow_cache_shrink.isra.
    53: 0000000000000000   118 FUNC    LOCAL  DEFAULT   14 flow_cache_cpu
    54: 0000000000000000   128 OBJECT  LOCAL  DEFAULT   50 flow_cache_global
    55: 0000000000000000    88 OBJECT  LOCAL  DEFAULT   38 CSWTCH.63
    56: 0000000000000000   410 FUNC    LOCAL  DEFAULT   18 flow_cache_init_global
    57: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   52 flow_flush_lock.27780
    58: 0000000000000000    16 FUNC    LOCAL  DEFAULT   22 flow_cache_flush_task
    59: 0000000000000000    32 OBJECT  LOCAL  DEFAULT   46 flow_cache_flush_work
    60: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   41 __initcall_flow_cache_ini
    61: 0000000000000000    18 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_look
    62: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   32 __kcrctab_flow_cache_look
    63: 0000000000000012    17 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_geni
    64: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   36 __kcrctab_flow_cache_geni

So, I have a few symbols missing (flow_cache_gc_task) and a slightly different ordering.

Is the ordering important (I would say it is)? How does kpatch-build ensure the appropriate CFLAGS are used? flow_cache_gc_task is static, small with only one reference and may have been inlined in my build. Kernel is from Ubuntu.

I am using gcc 4.6 with the appropriate patch to make -ffunction-sections work.

joe-lawrence commented 7 years ago

Hi @vincentbernat,

The kpatch-build script only exports KCFLAGS="-I$DATADIR/patch -ffunction-sections -fdata-sections", the rest of the kernel flags and options should be coming from the unpacked kernel source. With that in mind, the script does a little bit of CONFIG_... checking to ensure a couple features like CONFIG_DEBUG_KERNEL are set. Could you attach the kernel .config for which you are trying to build and if possible, the .patch file as well?

As far as symbol correlation is concerned, I don't believe that ordering is an issue. kpatch-build/lookup.c has a bunch of logic, including string comparison, to try and associate original to patched symbols. However, the reported error message find_local_syms for file.c: found_none implies that it can't find any local symbols in those object files.

Can you repeat the build with kpatch-build --skip-cleanup ... and run readelf on the object files in /root/.kpatch/tmp/patched ?

Thanks!

vincentbernat commented 7 years ago

Hey!

Thanks for the fast answer. I am using this patch. And the .config: config-3.13.0-117-generic.txt.

The readelf -s output for the non-patched flow.o:

    43: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS flow.c
    44: 0000000000000000   113 FUNC    LOCAL  DEFAULT    2 flow_cache_new_hashrnd
    45: 0000000000000000    48 FUNC    LOCAL  DEFAULT    4 flow_cache_flush_per_cpu
    46: 0000000000000000   181 FUNC    LOCAL  DEFAULT    6 flow_cache_cpu_prepare.is
    47: 0000000000000000   314 FUNC    LOCAL  DEFAULT   10 flow_cache_flush_tasklet
    48: 0000000000000000   263 FUNC    LOCAL  DEFAULT    8 flow_cache_queue_garbage.
    49: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   51 flow_cache_gc_lock
    50: 0000000000000000    16 OBJECT  LOCAL  DEFAULT   44 flow_cache_gc_list
    51: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   43 flow_cachep
    52: 0000000000000000   329 FUNC    LOCAL  DEFAULT   12 __flow_cache_shrink.isra.
    53: 0000000000000000   118 FUNC    LOCAL  DEFAULT   14 flow_cache_cpu
    54: 0000000000000000   128 OBJECT  LOCAL  DEFAULT   50 flow_cache_global
    55: 0000000000000000    88 OBJECT  LOCAL  DEFAULT   38 CSWTCH.63
    56: 0000000000000000   410 FUNC    LOCAL  DEFAULT   18 flow_cache_init_global
    57: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   52 flow_flush_lock.27780
    58: 0000000000000000    16 FUNC    LOCAL  DEFAULT   22 flow_cache_flush_task
    59: 0000000000000000    32 OBJECT  LOCAL  DEFAULT   46 flow_cache_flush_work
    60: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   41 __initcall_flow_cache_ini
    61: 0000000000000000    18 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_look
    62: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   32 __kcrctab_flow_cache_look
    63: 0000000000000012    17 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_geni
    64: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   36 __kcrctab_flow_cache_geni

And the patched flow.o:

    43: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS flow.c
    44: 0000000000000000   113 FUNC    LOCAL  DEFAULT    2 flow_cache_new_hashrnd
    45: 0000000000000000    48 FUNC    LOCAL  DEFAULT    4 flow_cache_flush_per_cpu
    46: 0000000000000000   181 FUNC    LOCAL  DEFAULT    6 flow_cache_cpu_prepare.is
    47: 0000000000000000   314 FUNC    LOCAL  DEFAULT   10 flow_cache_flush_tasklet
    48: 0000000000000000   263 FUNC    LOCAL  DEFAULT    8 flow_cache_queue_garbage.
    49: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   51 flow_cache_gc_lock
    50: 0000000000000000    16 OBJECT  LOCAL  DEFAULT   44 flow_cache_gc_list
    51: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   43 flow_cachep
    52: 0000000000000000   329 FUNC    LOCAL  DEFAULT   12 __flow_cache_shrink.isra.
    53: 0000000000000000   118 FUNC    LOCAL  DEFAULT   14 flow_cache_cpu
    54: 0000000000000000   128 OBJECT  LOCAL  DEFAULT   50 flow_cache_global
    55: 0000000000000000    88 OBJECT  LOCAL  DEFAULT   38 CSWTCH.63
    56: 0000000000000000   410 FUNC    LOCAL  DEFAULT   18 flow_cache_init_global
    57: 0000000000000000     4 OBJECT  LOCAL  DEFAULT   52 flow_flush_lock.27780
    58: 0000000000000000    16 FUNC    LOCAL  DEFAULT   22 flow_cache_flush_task
    59: 0000000000000000    32 OBJECT  LOCAL  DEFAULT   46 flow_cache_flush_work
    60: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   41 __initcall_flow_cache_ini
    61: 0000000000000000    18 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_look
    62: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   32 __kcrctab_flow_cache_look
    63: 0000000000000012    17 OBJECT  LOCAL  DEFAULT   39 __kstrtab_flow_cache_geni
    64: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   36 __kcrctab_flow_cache_geni

No diff. Even without grep.

Maybe my libelf is too old (0.152-1ubuntu3.1). I can try to walk find_local_syms() to spot the problem.

vincentbernat commented 7 years ago

OK, I have stepped through it with gdb and here is what happens. find_local_syms is called with the symbol table from /usr/lib/debug/boot/vmlinux-3.13.0-117-generic (so it doesn't match what's compiled). It finds the STT_FILE symbol flow.c. The first LOCAL FUNC symbol it then finds is flow_cache_gc_task, which it compares to the first symbol in child_locals, flow_cache_new_hashrnd. They don't match, so it considers that it is no longer in flow.c (in_file=0) and continues scanning without any further success (there is no other STT_FILE flow.c), so no symbol is found.

So:

I am using the same compiler as Ubuntu used to compile the package. I suppose there is some difference in the options, or using -ffunction-sections or -fdata-sections makes a difference in the generated code with my version of gcc (which is quite old).
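
The walk described above can be condensed into a short Python sketch. It is a simplified model of the matching loop as reported from the gdb session, not the actual code in lookup.c; the Sym tuple, the list-based table and the return value are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Sym(NamedTuple):
    """One symbol table entry (illustrative, not kpatch's struct)."""
    name: str
    type: str            # "FILE", "FUNC" or "OBJECT"
    bind: str = "LOCAL"


def find_local_syms(table: List[Sym], hint: str,
                    child_locals: List[Tuple[str, str]]) -> Optional[int]:
    """Return the index of the FILE symbol named `hint` whose local
    FUNC/OBJECT symbols match `child_locals` one by one and in order,
    or None if no such section exists (the found_none case)."""
    in_file = False
    file_idx = None
    pos = 0  # index of the next expected entry in child_locals
    for idx, sym in enumerate(table):
        if sym.type == "FILE":
            # A new FILE symbol closes the previous section.
            if in_file and pos == len(child_locals):
                return file_idx
            in_file = sym.name == hint
            file_idx = idx if in_file else None
            pos = 0
            continue
        if not in_file or sym.bind != "LOCAL" or sym.type not in ("FUNC", "OBJECT"):
            continue
        if pos < len(child_locals) and child_locals[pos] == (sym.name, sym.type):
            pos += 1
        else:
            # First mismatch: assume we are no longer in the right file.
            in_file = False
    if in_file and pos == len(child_locals):
        return file_idx
    return None


if __name__ == "__main__":
    debug_vmlinux = [
        Sym("flow.c", "FILE"),
        Sym("flow_cache_gc_task", "FUNC"),
        Sym("flow_cache_new_hashrnd", "FUNC"),
    ]
    rebuilt_object = [("flow_cache_new_hashrnd", "FUNC")]
    print(find_local_syms(debug_vmlinux, "flow.c", rebuilt_object))
```

With the symbol order from the Ubuntu debug vmlinux, the first local after flow.c is flow_cache_gc_task, so the sketch abandons the section at once and never finds another flow.c entry.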

vincentbernat commented 7 years ago

Breakpoint 2, find_local_syms (table=0x6b2a80, hint=0x7ffff7fef061 "flow.c", child_locals=0x6b2910)
    at lookup.c:100
100                                     file_sym = sym;
2: *sym = {value = 0, size = 0, name = 0x8c68e0 "flow.c", type = 4, bind = 0, skip = 0}
1: sym = (struct object_symbol *) 0x7fffed2ab208
123                         !strcmp(child_sym->name, sym->name))
2: *child_sym = {name = 0x7ffff7fef068 "flow_cache_new_hashrnd", type = 2}
1: *sym = {value = 18446744071585605040, size = 191, name = 0x8c6900 "flow_cache_gc_task", type = 2,
  bind = 0, skip = 0}
126                             in_file = 0;
2: *child_sym = {name = 0x7ffff7fef068 "flow_cache_new_hashrnd", type = 2}
1: *sym = {value = 18446744071585605040, size = 191, name = 0x8c6900 "flow_cache_gc_task", type = 2,
  bind = 0, skip = 0}

joe-lawrence commented 7 years ago

Trying to repro here on RHEL7, and it looks like I hit a different build problem with that patch:

/usr/local/libexec/kpatch/create-diff-object: ERROR: mpls_gso.o: find_local_syms: 136: find_local_syms for mpls_gso.c: found_none

I hope to take a better look after lunch :)

vincentbernat commented 7 years ago

If you only have this one, it's unlikely that's the same problem. I get 170+ errors like this on my build.

joe-lawrence commented 7 years ago

@vincentbernat -- I think I've chased down my build bug to an unrelated matter. As far as yours is concerned, after running through find_local_syms in gdb myself, it looks like it's trying to compare the reference kernel object with a rebuilt, unpatched object file. And I think you're correct in that it wants to see similar symbol ordering in both tables.

On my RHEL7 setup, I compared this file's entries from my vmlinux file:

readelf --wide -s $VMLINUX | awk '/FILE/{out=0} /FILE.*flow\.c/{out=1} !/GLOBAL/{ if(out) print $0 }'

to the ones generated in ~/.kpatch/tmp/orig and they were indeed the same. How different do the entries for flow.c look in your reference vmlinux file?
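
For anyone who prefers to script the comparison, here is a rough Python equivalent of the awk filter above. It works on text already captured from readelf --wide -s; the function name file_section is made up for this example.

```python
import re
from typing import List


def file_section(readelf_text: str, source_name: str) -> List[str]:
    """Keep the non-GLOBAL lines that follow the FILE entry for
    `source_name`, up to the next FILE entry (same logic as the awk
    one-liner: any FILE line resets the flag, a matching one sets it)."""
    wanted = re.compile(r"FILE.*" + re.escape(source_name))
    keep = False
    kept = []
    for line in readelf_text.splitlines():
        if "FILE" in line:
            keep = bool(wanted.search(line))
        if keep and "GLOBAL" not in line:
            kept.append(line)
    return kept
```

Running it once on the reference vmlinux output and once on the output for the rebuilt flow.o gives two lists that can be compared directly.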

I wouldn't think that a gcc patch to implement -ffunction-sections would change these values, but without knowing much about gcc, it's possible. What happens if you rebuild the kernel source rpm with the patched gcc and attempt to kpatch against that one?

vincentbernat commented 7 years ago

The GCC patch is this one, so it's not really a patch to implement anything. The output of readelf -s for the original vmlinux is in the original comment.

I'll rebuild the source package and check whether any special flags are provided during the build.

joe-lawrence commented 7 years ago

> The output of readelf -s for the original vmlinux is in the original comment.

By "reference vmlinux", I meant /usr/lib/debug/boot/vmlinux-3.13.0-117-generic. That's the one that find_local_syms is comparing against, right? Just curious to see how different it is.

Can you edit/annotate the readelf examples in the previous comments to indicate which vmlinux and which flow.o is being listed? (Full command-line or paths would be sufficient.)

If I understand the bug report correctly, the ordering of the symbols for flow.c is different in /usr/lib/debug/boot/vmlinux-3.13.0-117-generic than it is when kpatch-build recreates its original, unpatched build (only adding the $KCFLAGS listed above.) The symbol ordering is however, consistent between kpatch-build original and patched versions of flow.o. Does that sound correct?

vincentbernat commented 7 years ago

Recompiled and patched flow.o have the same symbols in the same order. Only /usr/lib/debug/boot/vmlinux doesn't match. I'll update the other comments like you suggested.

joe-lawrence commented 7 years ago

@vincentbernat - in the first comment, is the output from readelf -s /usr/lib/debug/boot/vmlinux-3.13.0-117-generic correct? (I don't see the flow_cache_gc_task symbol that you reported as missing. If I understand the problem, shouldn't the flow.c FILE section of the reference vmlinux symbol table differ from its counterparts in the kpatch-built flow.o and the patched flow.o?)

As for next steps, I might have a look at why the compiler is generating different symbol tables. Is it possible that vmlinux-3.13.0-117-generic was built by a distro-specific gcc? (ie, includes other changes that your patched gcc doesn't.) @jpoimboe @flaming-toast any other ideas?

vincentbernat commented 7 years ago

@joe-lawrence oh, it seems I messed up the copy-paste. I have corrected it with the actual output.

As for the gcc used, the version is present in dmesg and matches mine. I have taken the same sources (with apt-get source), added the patch I mentioned, and recompiled. I haven't had time yet to simply recompile the kernel package without any change and check what the debug vmlinux looks like. Feel free not to investigate too much in the meantime.

jpoimboe commented 7 years ago

The only thing I can think of is that the package build might be adding compiler flags on the command line beyond what the makefiles do.

vincentbernat commented 7 years ago

Here is the command-line used by kpatch:

gcc -Wp,-MD,net/core/.flow.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/include -I/home/ubuntu/.kpatch/src/arch/x86/include -Iarch/x86/include/generated -Iinclude -I/home/ubuntu/.kpatch/src/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/ubuntu/.kpatch/src/include/uapi -Iinclude/generated/uapi -include /home/ubuntu/.kpatch/src/include/linux/kconfig.h -Iubuntu/include -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -std=gnu89 -O2 -m64 -mno-mmx -mno-sse -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -Wframe-larger-than=1024 -fstack-protector -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-var-tracking-assignments -g -pg -mfentry -DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -DCC_HAVE_ASM_GOTO -I/home/ubuntu/kpatch/kmod/patch -ffunction-sections -fdata-sections '-DKBUILD_STR(s)=#s' '-DKBUILD_BASENAME=KBUILD_STR(flow)' '-DKBUILD_MODNAME=KBUILD_STR(flow)' -c -o net/core/.tmp_flow.o net/core/flow.c

And the command-line used during package build:

gcc -Wp,-MD,net/core/.flow.o.d  -nostdinc -isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/include -I/usr/src/linux-headers-lbm- -I/home/ubuntu/linux2/arch/x86/include -Iarch/x86/include/generated  -I/home/ubuntu/linux2/include -Iinclude -I/home/ubuntu/linux2/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/ubuntu/linux2/include/uapi -Iinclude/generated/uapi -include /home/ubuntu/linux2/include/linux/kconfig.h -Iubuntu/include -I/home/ubuntu/linux2/ubuntu/include  -I/home/ubuntu/linux2/net/core -Inet/core -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -std=gnu89 -O2 -m64 -mno-mmx -mno-sse -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -Wframe-larger-than=1024 -fstack-protector -Wno-unused-but-set-variable -fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-var-tracking-assignments -g -pg -mfentry -DCC_USING_FENTRY -fno-inline-functions-called-once -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -DCC_HAVE_ASM_GOTO    -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(flow)"  -D"KBUILD_MODNAME=KBUILD_STR(flow)" -c -o net/core/.tmp_flow.o /home/ubuntu/linux2/net/core/flow.c

The difference is -fno-inline-functions-called-once added when doing the package build (and -ffunction-sections and -fdata-sections added by kpatch). This should be added only if CONFIG_DEBUG_SECTION_MISMATCH is set and while it is unset in .config, it is forcibly set by the make command:

make ARCH=x86_64 CROSS_COMPILE= KERNELVERSION=3.13.0-117-generic CONFIG_DEBUG_SECTION_MISMATCH=y KBUILD_BUILD_VERSION="164~precise1" LOCALVERSION= localver-extra= CFLAGS_MODULE="-DPKG_ABI=117" O=/home/ubuntu/linux2/debian/build/build-generic -j8 bzImage modules

What's the best way to ensure kpatch also compiles the kernel with the same command line?
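
Spotting the odd flag in two command lines of this length is easier with a small script. The sketch below is only an illustration: the function names and the prefix list are assumptions, and it deliberately looks only at options that are likely to affect code generation, ignoring include paths, defines and warnings.

```python
import shlex
from typing import List, Set, Tuple

# Option prefixes treated as relevant to code generation (an assumption
# for this sketch, not an exhaustive list).
CODEGEN_PREFIXES = ("-f", "-m", "-O", "-g", "-pg")


def codegen_flags(command: str) -> Set[str]:
    """Split a compiler command line like a shell would and keep the
    options that start with one of CODEGEN_PREFIXES."""
    return {tok for tok in shlex.split(command)
            if tok.startswith(CODEGEN_PREFIXES)}


def flag_diff(cmd_a: str, cmd_b: str) -> Tuple[List[str], List[str]]:
    """Return (flags only in cmd_a, flags only in cmd_b), sorted."""
    a, b = codegen_flags(cmd_a), codegen_flags(cmd_b)
    return sorted(a - b), sorted(b - a)
```

Fed the kpatch and package-build command lines, it should report the two section flags on one side and -fno-inline-functions-called-once on the other.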

jpoimboe commented 7 years ago

My advice would be to patch the makefile to add -fno-inline-functions-called-once instead of adding the option manually on the command line.

vincentbernat commented 7 years ago

So, I have made some progress but I still have some modules failing.

From readelf -s /usr/lib/debug/.../vmlinux:

 59720: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS tcp_cubic.c
 59721: ffffffff816e5af0   106 FUNC    LOCAL  DEFAULT    1 bictcp_recalc_ssthresh
 59722: ffffffff81d1ba10     4 OBJECT  LOCAL  DEFAULT   16 fast_convergence
 59723: ffffffff81d1ba14     4 OBJECT  LOCAL  DEFAULT   16 beta
 59724: ffffffff816e5b60    30 FUNC    LOCAL  DEFAULT    1 bictcp_undo_cwnd
 59725: ffffffff81d914e7   106 FUNC    LOCAL  DEFAULT   19 cubictcp_register
 59726: ffffffff81d1ba34     4 OBJECT  LOCAL  DEFAULT   16 bic_scale
 59727: ffffffff81d1ba28     4 OBJECT  LOCAL  DEFAULT   16 cube_rtt_scale
 59728: ffffffff81d1ba30     4 OBJECT  LOCAL  DEFAULT   16 beta_scale
 59729: ffffffff81d1ba00     4 OBJECT  LOCAL  DEFAULT   16 hystart
 59730: ffffffff81d1ba20     8 OBJECT  LOCAL  DEFAULT   16 cube_factor
 59731: ffffffff81d1b980   128 OBJECT  LOCAL  DEFAULT   16 cubictcp
 59732: ffffffff81e72d7d    18 FUNC    LOCAL  DEFAULT   27 cubictcp_unregister
 59733: ffffffff816e5b80   705 FUNC    LOCAL  DEFAULT    1 bictcp_cong_avoid.part.3
 59734: ffffffff81d1ba2c     4 OBJECT  LOCAL  DEFAULT   16 tcp_friendliness
 59735: ffffffff818a3160    64 OBJECT  LOCAL  DEFAULT    4 v.49260
 59736: ffffffff816e5e50   259 FUNC    LOCAL  DEFAULT    1 bictcp_cong_avoid
 59737: ffffffff816e5f60   336 FUNC    LOCAL  DEFAULT    1 hystart_update
 59738: ffffffff81d1ba08     4 OBJECT  LOCAL  DEFAULT   16 hystart_detect
 59739: ffffffff81d1ba0c     4 OBJECT  LOCAL  DEFAULT   16 hystart_ack_delta
 59740: ffffffff816e60b0   217 FUNC    LOCAL  DEFAULT    1 bictcp_acked
 59741: ffffffff81d1ba04     4 OBJECT  LOCAL  DEFAULT   16 hystart_low_window
 59742: ffffffff816e6190   287 FUNC    LOCAL  DEFAULT    1 bictcp_init
 59743: ffffffff81d1ba18     4 OBJECT  LOCAL  DEFAULT   16 initial_ssthresh
 59744: ffffffff816e62b0   249 FUNC    LOCAL  DEFAULT    1 bictcp_state
 59745: ffffffff81b6e778     8 OBJECT  LOCAL  DEFAULT   15 __modver_attr
 59746: ffffffff81ce1b40    72 OBJECT  LOCAL  DEFAULT   16 ___modver_attr
 59747: ffffffff81e597c8     8 OBJECT  LOCAL  DEFAULT   20 __initcall_cubictcp_regis
 59748: ffffffff81b6e528    32 OBJECT  LOCAL  DEFAULT   14 __param_hystart_ack_delta
 59749: ffffffff818a31a0    28 OBJECT  LOCAL  DEFAULT    4 __param_str_hystart_ack_d
 59750: ffffffff81b6e548    32 OBJECT  LOCAL  DEFAULT   14 __param_hystart_low_windo
 59751: ffffffff818a31c0    29 OBJECT  LOCAL  DEFAULT    4 __param_str_hystart_low_w
 59752: ffffffff81b6e568    32 OBJECT  LOCAL  DEFAULT   14 __param_hystart_detect
 59753: ffffffff818a31e0    25 OBJECT  LOCAL  DEFAULT    4 __param_str_hystart_detec
 59754: ffffffff81b6e588    32 OBJECT  LOCAL  DEFAULT   14 __param_hystart
 59755: ffffffff818a3200    18 OBJECT  LOCAL  DEFAULT    4 __param_str_hystart
 59756: ffffffff81b6e5a8    32 OBJECT  LOCAL  DEFAULT   14 __param_tcp_friendliness
 59757: ffffffff818a3220    27 OBJECT  LOCAL  DEFAULT    4 __param_str_tcp_friendlin
 59758: ffffffff81b6e5c8    32 OBJECT  LOCAL  DEFAULT   14 __param_bic_scale
 59759: ffffffff818a3240    20 OBJECT  LOCAL  DEFAULT    4 __param_str_bic_scale
 59760: ffffffff81b6e5e8    32 OBJECT  LOCAL  DEFAULT   14 __param_initial_ssthresh
 59761: ffffffff818a3260    27 OBJECT  LOCAL  DEFAULT    4 __param_str_initial_ssthr
 59762: ffffffff81b6e608    32 OBJECT  LOCAL  DEFAULT   14 __param_beta
 59763: ffffffff818a327b    15 OBJECT  LOCAL  DEFAULT    4 __param_str_beta
 59764: ffffffff81b6e628    32 OBJECT  LOCAL  DEFAULT   14 __param_fast_convergence
 59765: ffffffff818a3290    27 OBJECT  LOCAL  DEFAULT    4 __param_str_fast_converge

And the non-patched version:

     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS tcp_cubic.c
     6: 0000000000000000   106 FUNC    LOCAL  DEFAULT    4 bictcp_recalc_ssthresh
     7: 0000000000000090     4 OBJECT  LOCAL  DEFAULT   32 fast_convergence
     8: 0000000000000094     4 OBJECT  LOCAL  DEFAULT   32 beta
    10: 0000000000000000    30 FUNC    LOCAL  DEFAULT    6 bictcp_undo_cwnd
    12: 0000000000000000   106 FUNC    LOCAL  DEFAULT    8 cubictcp_register
    13: 00000000000000b4     4 OBJECT  LOCAL  DEFAULT   32 bic_scale
    14: 00000000000000a8     4 OBJECT  LOCAL  DEFAULT   32 cube_rtt_scale
    15: 00000000000000b0     4 OBJECT  LOCAL  DEFAULT   32 beta_scale
    16: 0000000000000080     4 OBJECT  LOCAL  DEFAULT   32 hystart
    17: 00000000000000a0     8 OBJECT  LOCAL  DEFAULT   32 cube_factor
    18: 0000000000000000   128 OBJECT  LOCAL  DEFAULT   32 cubictcp
    20: 0000000000000000    18 FUNC    LOCAL  DEFAULT   10 cubictcp_unregister
    22: 0000000000000000   705 FUNC    LOCAL  DEFAULT   12 bictcp_cong_avoid.part.3
    23: 00000000000000ac     4 OBJECT  LOCAL  DEFAULT   32 tcp_friendliness
    24: 0000000000000000    64 OBJECT  LOCAL  DEFAULT   34 v.49260
    26: 0000000000000000   259 FUNC    LOCAL  DEFAULT   14 bictcp_cong_avoid
    28: 0000000000000000   336 FUNC    LOCAL  DEFAULT   16 hystart_update
    29: 0000000000000088     4 OBJECT  LOCAL  DEFAULT   32 hystart_detect
    30: 000000000000008c     4 OBJECT  LOCAL  DEFAULT   32 hystart_ack_delta
    32: 0000000000000000   217 FUNC    LOCAL  DEFAULT   18 bictcp_acked
    33: 0000000000000084     4 OBJECT  LOCAL  DEFAULT   32 hystart_low_window
    35: 0000000000000000   287 FUNC    LOCAL  DEFAULT   20 bictcp_init
    36: 0000000000000098     4 OBJECT  LOCAL  DEFAULT   32 initial_ssthresh
    38: 0000000000000000   249 FUNC    LOCAL  DEFAULT   22 bictcp_state
    40: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   24 __modver_attr
    41: 0000000000000000    72 OBJECT  LOCAL  DEFAULT   36 ___modver_attr
    43: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   26 __exitcall_cubictcp_unreg
    45: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   28 __initcall_cubictcp_regis
    47: 0000000000000000    32 OBJECT  LOCAL  DEFAULT   30 __param_hystart_ack_delta
    48: 0000000000000000    28 OBJECT  LOCAL  DEFAULT   38 __param_str_hystart_ack_d
    49: 0000000000000020    32 OBJECT  LOCAL  DEFAULT   30 __param_hystart_low_windo
    50: 0000000000000000    29 OBJECT  LOCAL  DEFAULT   39 __param_str_hystart_low_w
    51: 0000000000000040    32 OBJECT  LOCAL  DEFAULT   30 __param_hystart_detect
    52: 0000000000000000    25 OBJECT  LOCAL  DEFAULT   40 __param_str_hystart_detec
    53: 0000000000000060    32 OBJECT  LOCAL  DEFAULT   30 __param_hystart
    54: 0000000000000000    18 OBJECT  LOCAL  DEFAULT   41 __param_str_hystart
    55: 0000000000000080    32 OBJECT  LOCAL  DEFAULT   30 __param_tcp_friendliness
    56: 0000000000000000    27 OBJECT  LOCAL  DEFAULT   42 __param_str_tcp_friendlin
    57: 00000000000000a0    32 OBJECT  LOCAL  DEFAULT   30 __param_bic_scale
    58: 0000000000000000    20 OBJECT  LOCAL  DEFAULT   43 __param_str_bic_scale
    59: 00000000000000c0    32 OBJECT  LOCAL  DEFAULT   30 __param_initial_ssthresh
    60: 0000000000000000    27 OBJECT  LOCAL  DEFAULT   44 __param_str_initial_ssthr
    61: 00000000000000e0    32 OBJECT  LOCAL  DEFAULT   30 __param_beta
    62: 0000000000000000    15 OBJECT  LOCAL  DEFAULT   45 __param_str_beta
    63: 0000000000000100    32 OBJECT  LOCAL  DEFAULT   30 __param_fast_convergence
    64: 0000000000000000    27 OBJECT  LOCAL  DEFAULT   46 __param_str_fast_converge

The only difference is that I am missing this line in /usr/lib/debug/.../vmlinux:

    43: 0000000000000000     8 OBJECT  LOCAL  DEFAULT   26 __exitcall_cubictcp_unreg

If I look at the final vmlinux (compiled with the patch), the line is also missing. I suppose that the function is stripped because the .o becomes "built-in". And I see this case is handled in lookup.c. In my case, table->vmlinux is 0, so the symbol is not discarded. The problem is that my debug vmlinux is vmlinux-3.13.0-117-generic, not vmlinux. I'll cook a patch for this.
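
To make the failure mode concrete, here is a small Python sketch of the two pieces involved. It is not the lookup.c code and not the fix from any patch: the exact-name check reflects the behaviour described above, while the prefix-based check and all function names are hypothetical alternatives used only for illustration.

```python
import os
from typing import Iterable, List

# Symbols present in a built-in .o but absent from the linked vmlinux,
# judging from the tcp_cubic.c listing above (assumed, not exhaustive).
DISCARDED_PREFIXES = ("__exitcall_",)


def is_vmlinux_exact(path: str) -> bool:
    """Treat the object as vmlinux only if the file is named exactly so."""
    return os.path.basename(path) == "vmlinux"


def is_vmlinux_by_prefix(path: str) -> bool:
    """Hypothetical looser check that also accepts versioned file names."""
    return os.path.basename(path).startswith("vmlinux")


def expected_locals(child_names: Iterable[str], vmlinux: bool) -> List[str]:
    """Drop symbols that the final link discards before comparing the
    .o symbol list with the vmlinux symbol list."""
    if not vmlinux:
        return list(child_names)
    return [n for n in child_names if not n.startswith(DISCARDED_PREFIXES)]
```

With the exact-name check, /usr/lib/debug/boot/vmlinux-3.13.0-117-generic is not recognised as vmlinux, so __exitcall_cubictcp_unreg stays in the expected list and the comparison fails.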

vincentbernat commented 7 years ago

After applying #707, I still have 4 errors (one of them is mpls_gso.o). I'll investigate them later.

vincentbernat commented 7 years ago

So, the remaining problems for me are:

 30046: ffffffff8138e2e0     9 FUNC    LOCAL  DEFAULT    1 copy_page_rep

Any idea for the last one? Maybe copy_page_64.S got combined with vsprintf.o at link time?

joe-lawrence commented 7 years ago

Hi @vincentbernat, I don't know if you're still plugging away at building your patch, but I've collected a few Ubuntu fixes in my own tree for testing (in the aptly named "ubuntu_fixes" branch). See the commit log for an example usage of KPATCH_GCC_OBJ_IGNORE to help avoid #701 and #708.

With the test branch and running Ubuntu 3.13.0-119-generic, I can build and load the meminfo_string test integration example.

However, the patch you attached in the third comment still has issues:

Anyway, just thought I'd point you to that (slow) work in progress. You might have better luck if you can modify the patch to pare down the number of affected functions. Thanks for all the bug reports and testing!

vincentbernat commented 7 years ago

Thanks for the heads up. Unfortunately, I don't currently have the time to debug more (and I have finally opted to reboot the affected kernels, so I don't have an immediate use of the patch). My next step was to modify the patch to not modify any .h, to put the code in a dedicated .h (with a different function name) and only include it in the .c files that need it.

As for -fno-inline-functions-called-once, this is still used in Ubuntu 4.4 (in Xenial).

dalehamel commented 6 years ago

@vincentbernat @joe-lawrence Sorry to revive an old thread, but I've hit what seems to be a similar problem, and I wonder if I could benefit from your insights.

/usr/libexec/kpatch/create-diff-object: ERROR: meminfo.o: find_local_syms: 174: find_local_syms for meminfo.c: couldn't find in vmlinux symbol table

Even applying a trivial patch is causing me the same issue (such as the one in the readme):


---
 fs/proc/meminfo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 6bb20f864259..97a62c33c130 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -133,7 +133,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
        seq_printf(m, "VmallocTotal:   %8lu kB\n",
                   (unsigned long)VMALLOC_TOTAL >> 10);
        show_val_kb(m, "VmallocUsed:    ", 0ul);
-       show_val_kb(m, "VmallocChunk:   ", 0ul);
+       show_val_kb(m, "VMallocChunk:   ", 0ul);

 #ifdef CONFIG_MEMORY_FAILURE
        seq_printf(m, "HardwareCorrupted: %5lu kB\n",
--
2.16.4

The patch I am actually trying to apply is this one, against kernel sources that I have built myself for ContainerOS (based on Chromium, which is based on Gentoo).

The specific kernel revision I am trying to patch is: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/e7d0ede66eb4efd21bf8935391cc2797f480c15f, and I'm trying to apply one patch that reverts https://chromium.googlesource.com/chromiumos/third_party/kernel/+/27f29dbceb3c979d00833a90aa27ff0756ecc1e0 and a second one that applies https://github.com/torvalds/linux/commit/f0a2aa5a2a406d0a57aa9b320ffaa5538672b6c5#diff-bea6197e6c8c9c3d726c6514ca7c5a03

I am running an instance in GKE where I want to apply the above patch to get more debugging visibility via uprobes, and literally just need to change the address of one function call to another per the patch.

However, I don't think it is the patch that is the problem - any patch I try to apply gives me similar errors.

I've been using 0.6.0 and 0.6.1 and get the same problem (only 0.6.1 gives me a better error message, indicating vmlinux specifically).

The only object file that changes is trace_uprobe.o, but kpatch-build still fails.

Here is an excerpt from the vmlinux symbol table: https://gist.github.com/dalehamel/599d5e4bd3d34511fb6126382ca5ed7d

And here are the before and after symbol tables from the patch

orig: https://gist.github.com/dalehamel/65d5b0503a343e3e47f0c3435ccbb204 patched: https://gist.github.com/dalehamel/de6021f5bbafd5bce7e09f70d4912f54

I'm pretty stumped... does the error message indicate that trace_uprobe.c is the symbol it is looking for in vmlinux, and can't find it? Or is it looking for all of the symbols from trace_uprobe.c? Does it only look for ones that changed?

It is also odd to me that all those values are 0'd out in the uprobe object file, but it's been a while since I've worked with the ELF format.

Any insights you can share to get me unstuck would be much appreciated.

dalehamel commented 6 years ago

I've been invoking kpatch-build as such and getting the following output:

kpatch-build patch1.patch patch2.patch -c .config  -v /vmlinux
  -s /usr/src/linux  -t vmlinux
Using source directory at /usr/src/kernel
Testing patch file(s)
Reading special section data
Building original kernel
Building patched kernel
Extracting new and modified ELF sections
/usr/libexec/kpatch/create-diff-object: ERROR: trace_uprobe.o: find_local_syms: 174: find_local_syms for trace_uprobe.c: couldn't find in vmlinux symbol table
ERROR: 1 error(s) encountered. Check /root/.kpatch/build.log for more details.

/vmlinux is where the kernel I want to patch is, but I get the same error trying to use the 'orig' kernel produced by just running make vmlinux in the kernel source directory.

I have verified that the GCC versions match exactly. The only unusual thing I'm doing is running this on an unsupported distro, but the only distro-specific logic seems to be downloading kernel sources and debug symbols, and I'm generating everything from source.

jpoimboe commented 6 years ago

It's trying to find the part of the symbol table in vmlinux which corresponds to the orig .o symbol table. It expects them to match up exactly. I think the problem is that vmlinux has the following symbols:

12481: ffffffff821051c8      1 OBJECT  LOCAL  DEFAULT       21 __warned.42167
12486: ffffffff82520388      0 OBJECT  LOCAL  DEFAULT       55 __key.41841
12495: ffffffff82520388      0 OBJECT  LOCAL  DEFAULT       55 __key.41683

whereas the orig.o has:

   84: 0000000000000000      1 OBJECT  LOCAL  DEFAULT      135 __warned.42176
   96: 0000000000000000      0 OBJECT  LOCAL  DEFAULT      133 __key.41850
  121: 0000000000000000      0 OBJECT  LOCAL  DEFAULT      134 __key.41692

Notice the numbered suffixes differ. This indicates that the .o file built by kpatch-build was somehow compiled differently than the original vmlinux: maybe some different compiler flags were used? Or there is some subtle difference in the sources or compilers used.
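To make the "match up exactly" requirement concrete, here is a minimal sketch of that kind of lookup. It is a simplified model and not kpatch's actual code: `struct sym` and `find_local_block` are hypothetical names, and a real ELF symbol carries more fields (type, binding, section index) than the name and FILE flag used here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an ELF symbol table entry (assumption:
 * only the name and "is this a FILE symbol" matter for the sketch). */
struct sym {
	const char *name;
	bool is_file;
};

/*
 * Look for a FILE symbol named `file` in `table` whose following local
 * symbols equal `locals` exactly and in order.
 * Returns the index of that FILE symbol, or -1 if no block matches.
 */
long find_local_block(const struct sym *table, size_t n_table,
		      const char *file,
		      const char *const *locals, size_t n_locals)
{
	for (size_t i = 0; i < n_table; i++) {
		if (!table[i].is_file || strcmp(table[i].name, file) != 0)
			continue;

		/* Walk the symbols after the FILE entry and compare names. */
		size_t j = 0;
		while (j < n_locals && i + 1 + j < n_table &&
		       !table[i + 1 + j].is_file &&
		       strcmp(table[i + 1 + j].name, locals[j]) == 0)
			j++;

		if (j == n_locals)
			return (long)i;
	}
	return -1;
}
```

With an exact string comparison like this, a single name such as `__warned.42167` versus `__warned.42176` is enough to make the whole block fail to match, which is consistent with the error reported here.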

You could hack around it by changing the strcmp logic in locals_match() to allow differences after the '.' character, but that would still be covering up a root issue of some compiled difference between the files. They should be identical.
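As a rough illustration of that workaround, the comparison could look something like the sketch below. The helper `names_match_ignoring_suffix` is hypothetical and is not the code in locals_match(); it only shows the idea of comparing names up to the first '.' character.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of kpatch): compare two symbol names
 * while ignoring everything after the first '.' in each of them, so
 * that "__warned.42167" and "__warned.42176" are treated as equal.
 */
bool names_match_ignoring_suffix(const char *a, const char *b)
{
	size_t len_a = strcspn(a, ".");
	size_t len_b = strcspn(b, ".");

	/* Names that start with '.' have no prefix to compare, so fall
	 * back to an exact comparison for them. */
	if (len_a == 0 || len_b == 0)
		return strcmp(a, b) == 0;

	return len_a == len_b && strncmp(a, b, len_a) == 0;
}
```

Note that cutting at the first '.' is coarse: two distinct statics with the same base name (for example the two `__key.*` entries above) become indistinguishable, so this only hides the mismatch and does not explain it.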

dalehamel commented 6 years ago

Thanks for the response, that gives me a next step for where to look - go find out if the compiler flags are different, which I should be able to do by just turning up the verbosity of the build system.

I think the hackier solution may also work, as I don’t mind if more object code is replaced here so long as it is only for that file’s symbols.

To check for success, I guess I should verify that these suffix numbers match exactly between the reference kernel and my orig build. That should mean I’ve hit the exact flags and compiler conditions, I suppose.

NKTelnet commented 5 years ago

If I only have a kernel rpm file, how can I find out which compiler flags were used to build it?

joe-lawrence commented 5 years ago

@NKTelnet : I'm not sure if there is any way to dig that out of a plain RPM file (maybe there are ELF sections saved into vmlinux, but I doubt it). The easiest way would be to find and build a source rpm of the same version, then you can inspect the .filename.o.cmd files that are left behind.

jpoimboe commented 5 years ago

As far as I can tell, this issue seems to be obsolete, or at least not going anywhere. Please open a new issue if you still have a problem. Thanks.

NKTelnet commented 5 years ago

> @NKTelnet : I'm not sure if there is anyway to dig that out of a plain RPM file (maybe there are ELF sections saved into vmlinux, but I doubt it). The easiest way would be to find and build a source rpm of the same version, then you can inspect the .filename.o.cmd files that are left behind.

Thanks a lot, Joe. I cannot find a source rpm, because it is a privately built rpm.