dynup / kpatch

kpatch - live kernel patching
GNU General Public License v2.0
1.49k stars 305 forks source link

Binary patch for CVE-2016-3134 fails to load (CentOS 7, kernel 3.10.0-327.4.4) #612

Closed euspectre closed 1 year ago

euspectre commented 7 years ago

I encounered the following problem while experimenting with Kpatch on CentOS 7 with kernel 3.10.0-327.4.4.

kpatch-build reports no errors when building a binary patch from the fix for CVE-2016-3134, netfilter-fix-unconditional-helper.patch, but the resulting binary patch fails to load with the following errors in dmesg:

kpatch: symbol 'uncond.48011' not found in symbol table
kpatch: unable to find symbol 'uncond.48011'

The modules to be patched have similar symbols but with the different numeric suffixes added by GCC:

# grep -F uncond. /proc/kallsyms 
ffffffffa03914a0 b uncond.48012 [ip6_tables]
ffffffffa00c8580 b uncond.48784 [ip_tables]

It seems that either these static locals were not correlated properly by create-diff-object or they cannot be correlated at all. If the latter is the case, create-diff-object should report an error like it usually does in such cases.

Another question is, why GCC gave these variables the different numeric suffixes compared to what it did when building the kernel packages for CentOS. But this is beyond the scope of this issue, perhaps. I use the same GCC version as was used for the kernel packages, GCC 4.8.3 20140911 (Red Hat 4.8.3-9).

Interestingly enough, the same binary patch builds and loads just fine on the "neighbor" kernel versions, 3.10.0-327.3.1 and 3.10.0-327.4.5.

zhouchengming1 commented 7 years ago

Hi, I think you didn't specify the vmlinux when you do kpatch-build, did you? So the vmlinux the kpatch-build uses will be the vmlinux compiled from the base code, for some reason, this vmlinux is not the same as the vmlinux you are running, like base-vmlinux has the symbol 'uncond.48011', but the running-vmlinux has the symbol 'uncond.48012', so report this error when you load the patch module.

euspectre commented 7 years ago

I extracted vmlinux from the corresponding RPM with debug info from CentOS's repository and did pass it to kpatch-build as usual.

I did the same (with the appropriate packages, of course) when building binary patches for 3.10.0-327.3.1 and 3.10.0-327.4.5 and the patch was built and worked OK.

By the way, the symbol uncond.48012 is not from vmlinux as you can see, but rather from ip6_tables.ko.

Again, the main question here is not why GCC gave these variables different names but why create-diff-object gave no warning about it and thought that it prepared the patch successfully.

Any ideas?

zhouchengming1 commented 7 years ago

I extracted vmlinux from the corresponding RPM with debug info from CentOS's repository and did > pass it to kpatch-build as usual.

Oh, so any needed symbols of vmlinux will be the same, and will be found.

I did the same (with the appropriate packages, of course) when building binary patches for 3.10.0-327.3.1 and 3.10.0-327.4.5 and the patch was built and worked OK.

By the way, the symbol uncond.48012 is not from vmlinux as you can see, but rather from ip6_tables.ko.

I think this maybe the problem, ip6_tables.ko the create-diff-object used is compiled from the base code, maybe not the same as the ip6_tables.ko your machine already loaded.

Again, the main question here is not why GCC gave these variables different names but why create-diff-object gave no warning about it and thought that it prepared the patch successfully.

If local static symbols are correctly correlated, and can be found in the ip6_tables.ko the create-diff-object used, then no warning will be given.

Any ideas?

So I think the prolem is why the ip6_tables.ko compiled from the base code is not the same as the ip6_tables.ko your machine loaded, maybe you used a different gcc ? So how do you think ?

zhouchengming1 commented 7 years ago

Another question is, why GCC gave these variables the different numeric suffixes compared to what it did when building the kernel packages for CentOS. But this is beyond the scope of this issue, perhaps. I use the same GCC version as was used for the kernel packages, GCC 4.8.3 20140911 (Red > Hat 4.8.3-9).

Sorry, I didn't see this. So you use the same version gcc when you build kernel packages and do kpatch-build, then I thiink the numeric suffixes should be the same ( if code are the same) ... Is there any possibility that some gcc flags may influence the numeric suffixes ? Any ideas ?

euspectre commented 7 years ago

Ah, so it becomes a bit clearer, thanks. create-diff-object used the freshly built object files for the unpatched and patched ip6_tables.ko and correlated the static locals successfully. The original ip6_tables.ko that was running was not used there, that is why the problem was not detected. Makes sense.

Is there any possibility that some gcc flags may influence the numeric suffixes ? They might, I think. IIRC, GCC uses the same counter for these suffixes when copying functions. It might do the same for variables as well. Disable some optimization that was previously enabled (or vice versa) and you are likely to get different values of this counter.

Come to think about it, this kernel is special, it seems. Unlike 3.10.0-327.3.1 and 3.10.0-327.4.5, 3.10.0-327.4.4 has non-trivial CentOS-specific patches apart from the usual debranding ones.

I checked that the kernel source I used (unpacked SRPM + rpmbuild -bp) does have these patches applied. Still (pure guess), the original kernel CentOS uses might have been built with a slightly different patchset or something like that. I might be wrong here though.

It could be interesting to try to build that patch for the same version of the "pure" RH kernel rather than for that one from CentOS's repos. If the patchset is an issue, the binary patch should be OK this time. Though, I have no time for that right now, unfortunately.

jpoimboe commented 7 years ago

Sorry for the delay in looking at this. While I haven't looked into this problem very deeply, I suspect this is a known issue: #545. 'uncond' is a static local variable. gcc always adds the numeric suffix to static local variables. That numeric suffix is based on the symbol token identifier, which is subject to change after patching the code.

545 is an important issue, we just haven't found the time to fix it yet...

github-actions[bot] commented 1 year ago

This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added.

euspectre commented 1 year ago

The newly enabled "stale workflow" works wonders ;-)

I forgot about this issue. It is long since obsolete, I'll close it.