SUSE / libpulp

libpulp enables live patching in user space applications.
GNU Lesser General Public License v2.1
55 stars 11 forks source link

WIP: Add ulp_buildid #34

Closed simotek closed 3 years ago

simotek commented 3 years ago

As we will need to ship a live patch for each version of any lib we release, we will have multiple live patches to choose from in many cases. This new command takes a library name and a pid then returns the NT_GNU_BUILD_ID so that a script run as part of the rpm install process can determine which live patch to apply.

Example usage: ulp_buildid -p 28036 /usr/lib64/libcrypto.so.1.1

This is needed because we won't know which version of a library that we need to live patch, in some cases we could have 5-6 choices.

Its along time since i've written much C and when I did it was on a compiler that only supported a pretty old version of the standard so any feedback would be welcome. I just wanted to have something at the point where it was ready to get feedback.

ulp_buildid should probably be in libexecdir as its not really useful for users but I didn't get to that yet.

inconstante commented 3 years ago

As we will need to ship a live patch for each version of any lib we release, we will have multiple live patches to choose from in many cases. This new command takes a library name and a pid then returns the NT_GNU_BUILD_ID so that a script run as part of the rpm install process can determine which live patch to apply.

Thanks for the explanation. :)

Example usage: ulp_buildid -p 28036 /usr/lib64/libcrypto.so.1.1

This is needed because we won't know which version of a library that we need to live patch, in some cases we could have 5-6 choices.

OK.

Its along time since i've written much C and when I did it was on a compiler that only supported a pretty old version of the standard so any feedback would be welcome. I just wanted to have something at the point where it was ready to get feedback.

I wouldn't have known if you hadn't mentioned. :)

ulp_buildid should probably be in libexecdir as its not really useful for users but I didn't get to that yet.

Makes sense. Just to make sure, this doesn't require changes in this project, right? Rather in the .spec file.

Other than the comments above, I'd like to see a new manpage popping up for the new tool, if you have the time to write it (if you don't we should at least document its absence with a new github issue). Likewise in the README file.

susematz commented 3 years ago

As a general comment: I want all user facing functionality to be part of a single 'ulp' executable, not split over multiple executables. And I positively hate underscores in executables names. I consider printing the build id to be user facing because a random admin might be interested in that, not just the live patch build process, so please integrate this into 'ulp' as subcommand.

To perhaps clarify my mental model: anything that takes a PID is potentially user facing. Anything that only consumes an ELF file and some metadata, and emits either other ELF files or other metadata would not be user facing (e.g. used only in the livepatch build process). The latter can stay separate tools, from my perspective.

simotek commented 3 years ago

As a general comment: I want all user facing functionality to be part of a single 'ulp' executable, not split over multiple executables. And I positively hate underscores in executables names. I consider printing the build id to be user facing because a random admin might be interested in that, not just the live patch build process, so please integrate this into 'ulp' as subcommand.

To perhaps clarify my mental model: anything that takes a PID is potentially user facing. Anything that only consumes an ELF file and some metadata, and emits either other ELF files or other metadata would not be user facing (e.g. used only in the livepatch build process). The latter can stay separate tools, from my perspective.

This isn't really designed or intended to be a "user facing" which is why I was suggesting installing it to /usr/libexec. My intention here was to use it as part of an automated process to apply live patches on package install. The easiest and tidiest way to do this in my mind is to do this with a scripting language calling separate binaries where needed. The current work in progress prototype for this is https://github.com/simotek/libpulp/blob/tools/tools/ulp_apply Usage ./ulp_apply "/usr/lib64/libcrypto.so.1.1" This will then patch libcrypto in every running process with libpulp loaded (Note I need to push some changes to ulp_dump so it will work for everyone). Having said that maybe this tool would also be useful if someone felt the need to apply patches manually, i'm not sure what we actually want to support here.

I guess I was trying to be consistent with ulp_dump, ulp_trigger and ulp_reverse so the question is should they all be intergrated as sub commands of the ulp binary? if so should we do that all at the same time, or should we just optionally add the build_id as part of the broader info it prints.

Currently ulp doesn't seem well setup to handle sub commands so maybe that should be done all at once.

simotek commented 3 years ago

As we will need to ship a live patch for each version of any lib we release, we will have multiple live patches to choose from in many cases. This new command takes a library name and a pid then returns the NT_GNU_BUILD_ID so that a script run as part of the rpm install process can determine which live patch to apply.

Thanks for the explanation. :)

Example usage: ulp_buildid -p 28036 /usr/lib64/libcrypto.so.1.1 This is needed because we won't know which version of a library that we need to live patch, in some cases we could have 5-6 choices.

OK.

Its along time since i've written much C and when I did it was on a compiler that only supported a pretty old version of the standard so any feedback would be welcome. I just wanted to have something at the point where it was ready to get feedback.

I wouldn't have known if you hadn't mentioned. :)

ulp_buildid should probably be in libexecdir as its not really useful for users but I didn't get to that yet.

Makes sense. Just to make sure, this doesn't require changes in this project, right? Rather in the .spec file.

Ideally we'd modify the autotools scripts so that make install puts the file in whatever we specify libexecdir to be (Currently its different between tumbleweed and SLE/Leap)

Other than the comments above, I'd like to see a new manpage popping up for the new tool, if you have the time to write it (if you don't we should at least document its absence with a new github issue). Likewise in the README file.

Sure I can do that, I might wait till we answer some of the other questions in this thread though.

inconstante commented 3 years ago

Ideally we'd modify the autotools scripts so that make install puts the file in whatever we specify libexecdir to be (Currently its different between tumbleweed and SLE/Leap)

I stand corrected. :)

If I understand the manual correctly, using sbin_PROGRAMS rather than bin_PROGRAMS should do the trick, right?

giulianobelinassi commented 3 years ago

Hi,

To be fair, I don't think the way this feature is currently implemented is correct:

In tools/introspection.c, the function get_loaded_symbol_addr currently (on master) is already parsing the target process ELF headers in memory to find the .dynsym, .dynstr and .hash sections of each library linked against it. Therefore, I think using the structure of this function to recover the library's build id is more suitable in this case, because we don't have to expose new symbols in libpulp and certainly the patch will be a lot smaller.

Furthermore, if you proceed to do what I suggested, we could even extract the target's process ELF header parsing from get_loaded_symbol_addr to another function that finds the .dynsym, .dynstr, .hash and build_id, and add those information into the struct ulp_dynobj that represents each library. Then you just have to compare the build id found in that object.

(And get_loaded_symbol_addr could be refactored to only parse the remote process information to retrieve the symbols, but not to find the sections again.)

giulianobelinassi commented 3 years ago

Please check if #59 is what you had in mind when implementing this patch. If you don't like the output string or how the output is displayed, this can be changed of course :).

giulianobelinassi commented 3 years ago

A similar feature was implemented in ulp patches, and the build id of a loaded library can now be retrieved by:

$ ulp patches -b -p $(pid) | grep libdl.so | grep -oEi '([0-9a-f]){40}

So I am closing this PR.