One sentence summary of this PR (This should go in the CHANGELOG!)
Add separate data sections support to LLVM toolchain, and add general flag for including subsections
Add PatchMaker option to auto-choose a linux header include path
Link to Related Issue(s)
Extends and replaces #203
Please describe the changes in your request.
The addition of the separate_data_sections functionality to the LLVM toolchain aligns it with the GNU toolchain.
The other significant alignment is allowing LLVM to link against data symbols by making stub symbols strong instead of weak (a weak symbol definition is a full definition, but the linker allows multiple definitions as long as only one of them is strong; the strong definition will always be chosen over the weak). Then the data symbols we want to use in the target binary are defined in the patch source as "weak", very similar to using extern, however, extern symbols require LLVM to generate a Global Offset Table which is difficult to inject into a patch binary. Making these symbols weak requires treating weak symbols as "unresolved." I made changes to the parsers, renamed the fields holding the symbols to be clearer what they mean, and added some nice tests that actually caught a bug parsing x64 object files!
In order to get Clang to obey the "weak" attribute, I had to change the LLVM toolchain to use the normal GCC-like interface, and changing the associated flags. In theory we can use -Xclang to pass option which are only in the -cc1 frontend we were using previously. I've found that hard to test.
The least related change, and likely what bears the most scrutiny, is adding functionality to automatically include the linux cross-compile headers. These headers are necessary for any patch which would need to use things defined in these headers; because we can install the standard headers and pretty easily determine which set of headers we need, and it would be a pain for a user to figure out on their own (you get errors about headers being missing, most google results will suggest you install gcc-multilib which actually uninstalls all of our GNU toolchains), I think it's very much worth adding. Including these headers makes it much more likely a user's patch "just works."
That option currently defaults to False (don't auto-include these headers). However, the PatchFromSourceModifier and FunctionReplaceModifier always use it.
Also note that it is a PatchMaker option, not a Toolchain option. I'm willing to move it to ToolchainConfig, but I wanted to test the waters for moving some options out of ToolchainConfig and into PatchMaker. My thoughts are that some options are much more related to the "meta" workings of PatchMaker than the actual object files being built. Some other options that I believe would more naturally fit in PatchMaker.init and not ToolchainConfig are:
create_map_files
check_overlap
userspace_dynamic_linker (not too sure about this option)
perhaps debug_info
These are options that are not so much concerned with that actual code in the object files, but additional functionality/features that may be generated in addition to the code.
Anyone you think should look at this, specifically?
@whyitfor
@dannyp303
One sentence summary of this PR (This should go in the CHANGELOG!)
Link to Related Issue(s) Extends and replaces #203
Please describe the changes in your request. The addition of the
separate_data_sections
functionality to the LLVM toolchain aligns it with the GNU toolchain. The other significant alignment is allowing LLVM to link against data symbols by making stub symbols strong instead of weak (a weak symbol definition is a full definition, but the linker allows multiple definitions as long as only one of them is strong; the strong definition will always be chosen over the weak). Then the data symbols we want to use in the target binary are defined in the patch source as "weak", very similar to usingextern
, however,extern
symbols require LLVM to generate a Global Offset Table which is difficult to inject into a patch binary. Making these symbols weak requires treating weak symbols as "unresolved." I made changes to the parsers, renamed the fields holding the symbols to be clearer what they mean, and added some nice tests that actually caught a bug parsing x64 object files!In order to get Clang to obey the "weak" attribute, I had to change the LLVM toolchain to use the normal GCC-like interface, and changing the associated flags. In theory we can use
-Xclang
to pass option which are only in the-cc1
frontend we were using previously. I've found that hard to test.The least related change, and likely what bears the most scrutiny, is adding functionality to automatically include the linux cross-compile headers. These headers are necessary for any patch which would need to use things defined in these headers; because we can install the standard headers and pretty easily determine which set of headers we need, and it would be a pain for a user to figure out on their own (you get errors about headers being missing, most google results will suggest you install
gcc-multilib
which actually uninstalls all of our GNU toolchains), I think it's very much worth adding. Including these headers makes it much more likely a user's patch "just works."That option currently defaults to
False
(don't auto-include these headers). However, thePatchFromSourceModifier
andFunctionReplaceModifier
always use it.Also note that it is a PatchMaker option, not a Toolchain option. I'm willing to move it to ToolchainConfig, but I wanted to test the waters for moving some options out of ToolchainConfig and into PatchMaker. My thoughts are that some options are much more related to the "meta" workings of PatchMaker than the actual object files being built. Some other options that I believe would more naturally fit in PatchMaker.init and not ToolchainConfig are:
create_map_files
check_overlap
userspace_dynamic_linker
(not too sure about this option)debug_info
These are options that are not so much concerned with that actual code in the object files, but additional functionality/features that may be generated in addition to the code.Anyone you think should look at this, specifically? @whyitfor @dannyp303