One sentence summary of this PR (This should go in the CHANGELOG!)
Switch to the standard GCC-like frontend for LLVM, which supports the C __attribute__((weak))
Treat weak symbols as "undefined" in the BOM, so that alternative, strong definitions can be searched for
Generate LinkableBinary stubs as strong symbols, so that the linker uses them to override weak symbols in the patch
Link to Related Issue(s)
This makes it easier to link a patch against data symbols (e.g. global variables) in the target. The main challenge, compared to linking against functions, is that compilers often want to generate a Global Offset Table, which adds a level of indirection to the pointer to the data. It also complicates injecting the patch considerably, since you also have to somehow inject this Global Offset Table plus a runtime relocation entry.
Please describe the changes in your request.
Allow LLVM to link against data symbols by making stub symbols strong instead of weak. (A weak symbol definition is a full definition, but the linker allows multiple definitions as long as only one of them is strong; the strong definition is always chosen over the weak one.) The data symbols we want to use from the target binary are then defined in the patch source as "weak". This is very similar to using extern, except that extern symbols require LLVM to generate a Global Offset Table. For this to work, weak symbols have to be treated as "unresolved." I made changes to the parsers, renamed the fields holding the symbols to make their meaning clearer, and added tests, which caught a bug in parsing x64 object files!
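As a rough illustration of the patch-source side, here is a minimal sketch of a weak data definition. The names target_counter and bump_target_counter are hypothetical and not taken from this PR. It assumes a stub object defining the same symbol as strong is supplied at link time; compiled on its own, the weak definition is simply used as-is.

```c
/* patch.c: hypothetical patch source; every name here is illustrative. */

/* Weak definition standing in for a global variable that lives in the
 * target binary. If the link also includes a stub object that defines
 * target_counter as a strong symbol, the linker picks the strong
 * definition and this weak one is discarded. */
__attribute__((weak)) int target_counter;

/* Patch code reads and writes the global by name, as it would with extern. */
int bump_target_counter(int amount)
{
    target_counter += amount;
    return target_counter;
}
```

The only difference from an extern declaration in the source is the attribute and the fact that this is a definition, which is why the parsers have to report it as unresolved rather than defined.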
To get Clang to obey the "weak" attribute, I had to switch the LLVM toolchain to the normal GCC-like interface and change the associated flags. In theory, we can use -Xclang to pass options that exist only in the -cc1 frontend we were using previously, but I've found that hard to test.
Anyone you think should look at this, specifically?