sompen closed this pull request 3 years ago
Is there documentation from ARM on what this feature does and what it is used for?
@jpakkane, The partial linking model description is available in the armlink user guide in ARM website, please find the following related links: Overview of linking models (In this page please refer to the "Partial linking" section), Partial linking model
@jpakkane Could you please review the changes once and tell us if any other changes are required to get this PR merged to master?
I read through the linked pages but things are not really clear to me based on that. What is the specific problem partial linking solves and how does it do it differently from regular linking?
We are building software for embedded systems, where a final link step generates the image to be loaded onto the device by combining together several subsystems. Those subsystems may contain duplicated symbols (e.g. each subsystem might have its own main() function, malloc() function and a few others). We need to link each subsystem together, resolving all internal references; this means we can’t use an archive, which is what a meson static_library produces. But our embedded system doesn’t do dynamic loading, so we don’t want the full shared library support. Partial linking solves that problem for us.
Included partial library linking support for the GNU compiler too.
In practical use, is this an option that you always want to use on all build targets or only on a subset of them?
We will use this for all build targets (shared_library()) in the project.
@jpakkane - Do you need more information on this before we can merge this in? Are there any outstanding concerns we should be addressing here? thanks
Sorry for the delay, but the regressions in 0.51.0 have taken a lot of time so we have had to focus on those first. There are still a few outstanding but we'll get to this immediately after the point release is out.
Based on the docs and comments, this is roughly the way I understand your setup.
shared_library
executable) that has some of its own code and libraries mentioned above
Is this correct? And if it is, would adding the partial linking argument to c_link_args in the cross file fail because it would try to use it in the executable target as well, or would it actually work?
My comments inline below.
Based on the docs and comments, this is roughly the way I understand your setup.
Is this correct? And if it is, would adding the partial linking argument to c_link_args in the cross file fail because it would try to use it in the executable target as well or would it actually work?
[MB] - If you try this with a GCC toolchain, it puts both the ‘-shared’ and ‘-r’ arguments on the linker command line, which then fails. Additionally, it always sets -fPIC for shared libraries, which we don’t need and which results in increased code size.
We are wary of adding top level options for things like these because they will cause the number of options to massively blow up. One thing which we could do is to update add_project_arguments to be able to specify which target types they should be used in. So something like:
add_project_link_arguments('--something', language: 'c', target_type: 'shared_library')
Would something like this work for you?
Partially linked libraries are another type of library alongside the shared and static ones. They behave differently from the others and have a different use.
I see 3 ways to solve this:
By the way, the extra option (in option 2&3) is needed as partial linking isn't just another argument for the linker but another mode, which usually requires a different set of arguments. Some of the arguments used for the standard shared library linking model are incompatible with the partial model. So, using add_project_link_arguments isn't gonna work.
I am proposing to implement option 1.
is needed as partial linking isn't just another argument for the linker but another mode, which usually requires different set of arguments
How are they different? And how does it fail if you just add --partial or equivalent to the linker line?
As Malhar mentioned, the -shared, -r and -fPIC options should not be added when --partial is specified. This is for ArmCC. Other compilers/linkers might have other requirements.
The thing that concerns me is adding a full top level target type just for this. We try to keep the list of primitives as low as possible so things remain simple and understandable. Exposing the idiosyncrasies of every toolchain as a top level feature is not really sustainable.
I understand that adding more and more stuff to the top level might get out of hand. On the other hand, partially linked libraries are supported by toolchains other than ArmCC (gcc for example) and are useful in certain cases, so it might be a good idea to support them. Do you have a proposal for an alternative approach?
One possibility is to have a special target_type: for build_target() called custom, which will allow you to pass your own arguments for outputting a build target by passing them to c_args:.
But that does not help if you have a dependency that has a shared_library call, right? FWICT one of the issues here is propagating the need to shared link into subprojects without changing code in them.
But that does not help if you have a dependency that has a shared_library call, right? FWICT one of the issues here is propagating the need to shared link into subprojects without changing code in them.
True, the only thing that can affect that which I can think of is a b_ option. Maybe we can start filtering the options displayed by meson configure based on whether the configured compiler actually supports it.
Sorry for the delay. I looked more into this, but ran into troubles. I can't seem to be able to get partial linking to work on regular Ubuntu. Something like this:
gcc -Wl,-r -o libflob2.so libfile.c
produces an error:
/usr/bin/ld: -r and -pie may not be used together
Something in Ubuntu's default options (probably) hardcodes pie and there does not seem to be a linker switch to turn it off.
Another interesting question is whether this is the last library type there is or are there still other, as yet unsupported library types out there? And how are the partially linked libraries named? Do they have a special suffix or do they reuse .so or .a?
gcc -Wl,-r -o libflob2.so libfile.c
produces an error
As you can see when you append -v, gcc invokes the linker with a ton of other options. Invoking (GNU) ld directly Just Works4me:
ld -i -o partial.o libfile.o
Maybe incremental linking requires invoking the GNU linker directly? For now, until gcc supports it?
The libraries are just relocatable ELF files – essentially just large .o files
I think it would really remove a lot of confusion to stop calling these "libraries" or - even worse - "shared libraries". Neither http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474f/CIHGBHCE.html nor https://sourceware.org/binutils/docs-2.32/ld/Options.html calls them "libraries".
Comparing the partial.o and libfile.o above with diffoscope shows practically no difference.
Do you need more information on this before we can merge this in?
A runnable "demo" would really help IMHO. References to projects already using incremental linking (with or without Meson) would not hurt either.
https://unix.stackexchange.com/a/495127/28070 compares incremental linking with thin archives, which incidentally provides a nice description of what GNU ld -r does.
Link to new issue "Add a binutils module" #6063
Something in Ubuntu's default options (probably) hardcodes pie
Yes: for security reasons PIE is becoming the new default (from Debian actually)
and there does not seem to be a linker switch to turn it off.
Although this thread seems to be almost dead, I'd like to present a case of real-world usage of partially-linked libraries, or, as they're called in linker docs, "relocatable objects". It's not a very popular technique, and it's difficult to find a good use case for it on desktop systems, but if the target does not support shared libraries (== embedded) it is in many cases irreplaceable.
In our project, we release our software to customers in form of relocatable objects whenever shared libs can't be used. Before we've switched to this model, the only other method was a classic static library. That had severe drawbacks:
Relocatable objects solve these issues the following way:
LTO: when gcc -r is called, it performs final step of LTO compilation (if LTO was enabled), so the resulting object file contains only machine code, and no compiler-specific IL. It can then be linked by any linker that understands ELF - much like any normal object file. It works the same way when shared libraries are linked, so that you don't get IL in your .so files.
In our case, we're using CMake as a build system for legacy reasons. As it has absolutely no support for gcc -r out of the box, we have to maintain ~500 lines of custom, ugly, toolchain-specific and without any doubt extremely fragile CMake code... It's sad, but no build system that I know of supports pre-linked libraries as a first class target.
If support for this technique is implemented in Meson, it can potentially help a lot of embedded projects which are stuck with static libs as the only option. Though, I don't think it's going to be easy to support it, as toolchains' support for relocatable objects is not really coherent... But, at least GCC and Clang/LLVM are doing a good job here.
As mentioned above, this would be a nice thing to support, but unfortunately I don't have personal experience with this and could not get it to work using plain GCC. We try to avoid using the linker directly as it gets very confusing very fast, not to mention blows up the combination matrix (needing to support n compilers * m linkers is not a nice place to be in).
The interesting questions here are things like how this should work. Is a "prelinked library" its own library type? Is it a static library? Something else? As an example based on the link mentioned above, one approach could be this:
ld -r -o out.o inputs
out.o
Would this work? And more importantly, is it something we could always do by default if supported by the compiler? Is that sufficient for all use cases or would something more be needed?
If static library is built with LTO, its object files are compiled only partially: they do not contain machine code, but rather compiler-specific intermediate language.
That's not what the first lines of https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html say. They say "fat" object files contain both.
AFAIK, this requires the same toolchain to be used for linking and initial compilation
From the same page: "a side effect of this feature is that any mistake in the toolchain leads to LTO information not being used [...] This is both an advantage, as the system is more robust, and a disadvantage, as the user is not informed that the optimization has been disabled."
LTO: when gcc -r is called, it performs final step of LTO compilation (if LTO was enabled), so the resulting object file contains only machine code, and no compiler-specific IL.
Fascinating, is this documented somewhere?
LTO tries to address the 50 years old and totally obsolete "translation unit" C concept while pretending to stay compatible with it at the same time....
Is a "prelinked library" its own library type?
No, this is just a .o file. Again, keep using the word "library" only if you want to maximize the confusion and misunderstandings. I found no documentation calling this feature "library", is there any? These are just object files.
In fact using the word "library" is already inconsistent even before looking at this feature because static and shared libraries are totally different files from a build perspective: static libraries are in the 2nd column below while shared libraries are in the 3rd column. While that ship has sailed, the "partial library" confusion ship has not. Yet.
To see the very clear difference between the 2nd and 3rd column simply run "objdump -h" on a few files: .o files, .a files, .so files and executables.
           Relocatable        Static
Sources    Objects, static    executables
           libraries
---------|------------------|-------------------
a.c  --->  a.o
b.c   cc   b.o
            |
            | ld --partial
            v
           big.o
            |
            | ld --partial
            v
           bigger.o
            |
            | concatenation
            v
           libfoo.a   --->    .exe
                   ld -static

           Relocatable        Shared objects = "Libraries"
Sources    Objects            and Executables, both with
                              a PLT and a GOT
---------|------------------|-----------------------------
a.c  --->  a.o
b.c   cc   b.o
            |
            | ld --partial
            v
           big.o
            |
            | ld --partial
            v
           bigger.o   --->    libfoo.so and .exe files
                   ld --shared
Note some steps are obviously optional and a real project is unlikely to use all steps.
Ian Lance Taylor's excellent linker crash course: https://lwn.net/Articles/276782/
We try to avoid using the linker directly
That sounds unrealistic for supporting the very wide variety of linker needs in embedded, see old discussions in #6063 and links from there. Compilers keep making linker assumptions that keep breaking the fine linker control often required in embedded.
not to mention blows up the combination matrix (needing to support n compilers * m linkers is not a nice place to be in).
Not sure how to do this exactly but I feel like some kind of "hands off" linker mode could be nice in meson. "hands off" would ideally mean no combination matrix to test and support. Again see #6063
Not sure how to do this exactly but I feel like some kind of "hands off" linker mode could be nice in meson
Yes and no. One thing we have always done in Meson is to solve user problems rather than give them the tools to solve their own problems. The reason for this is that then everyone will solve their own problems in their own ways which are completely different and, usually, incompatible with each other (even though the problems all of them solve are usually pretty much the same). Instead we try to create a good, general solution with tests and all that and put it in main Meson code. In this way everyone can use it.
This is not a perfect approach, as nothing is. It does provide a bit of pain in the short term and can even prevent some people from doing things they need to do. But it really is the only reliable way to maintain coherency in the long term.
The reason for this is that then everyone will solve their own problems in their own ways which are completely different and, usually, incompatible with each other (even though the problems all of them solve are usually pretty much the same)
I agree and even applaud in general but I'm afraid the latter assumption is unfortunately flawed in linking for embedded = where the rubber meets the road. For the simple reason that hardware is always "different" as demonstrated by often complicated linker scripts; I don't think these scripts get written "just for fun". We'll see.
rather than give them the tools to solve their own problems.
Well, realism gave us custom_target at least (e.g. search #6063 for it) but I guess you don't want something in some poorly defined "middle" between 1. completely generic and agnostic custom_target and 2. the one well-defined true way to do something.
Sticking with the technical side for the moment. Suppose we have a project that builds one exe (or firmware or what have you). The code is split to 10 different static convenience libraries each with 10 files (for simplicity). What would be the correct way to prelink these?
for each sourceXX.c, compile to sourceXX.o then prelink each individually to sourcexx-prelinked.o, put all of these in a static library and proceed as usual
for each sourceXX.c, compile to sourceXX.o then prelink all these to one stlib-foo-prelinked.o, put it in a static library and proceed as usual
for each sourceXX.c, compile to sourceXX.o then prelink each individually to sourcexx-prelinked.o, put all of these in a static library and proceed as usual
I don't see why you would want to pre-link a single .o file with itself. You can technically do this but I don't see what purpose it would achieve in a real project.
for each sourceXX.c compile to sourceXX.o then prelink all these to one stlib-foo-prelinked.o, put it in a static library and proceed as usual
I don't see either why you would want to put a stlib-foo-prelinked.o (or any .o file really) in a .a container with a single .o file.
Note I tried to picture all possibilities in my ASCII art above but some steps are optional and a real project is unlikely to use all steps.
I don't see either why you would want to put a stlib-foo-prelinked.o (or any .o file really) in a .a container with a single .o file.
Because in Meson, object files never stand on their own. They are always tied to a build target (shared or static lib, module or exe). Yes, there may be a minor performance penalty for creating an .a file with just one .o file but it is almost always negligible.
for each sourceXX.c compile to sourceXX.o then prelink all these to one stlib-foo-prelinked.o, put it in a static library and proceed as usual
This one is probably the closest to the described use cases. Speaking of which:
Those subsystems may contain duplicated symbols (e.g. each subsystem might have its own main() function, malloc() function and a few others). We need to link each subsystem together, resolving all internal references;
Namespace separation: after we pre-link multiple static libraries / objects into one relocatable .o file, nothing prevents us from marking internal symbols as LOCAL, as they're never used outside of this .o file.
Both of these sound like "static" re-implementations of dllexport/visibility. https://gcc.gnu.org/wiki/Visibility
So maybe that's all what --partial (a.k.a. -r a.k.a. -i) is after all: the static equivalent of dllexport.
@marc-h38
That's not what the first lines of https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html say. They say "fat" object files contain both.
That's true, but do you really want to release fatLTO objects to 3rd-party customers? Personal opinion follows. I don't like fatLTO, as it works along the lines of "if you're lucky you will get optimized code, and if your toolchain is a bit off, you will not, and maybe we'll also have to debug customer's toolchain issues in addition to our own bugs". -r with proper options (see below) provides a viable alternative.
Fascinating, is this documented somewhere?
Ok, I've missed one detail here. This is controlled by the -flinker-output compiler driver option (GCC), documented at https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html . This option is a bit tricky, as it has different behaviors across different GCC versions. Based on my observations with GCC 8, when the default value is used (appears to be rel), the output is the same as with nolto-rel in GCC 9+, i.e. it compiles LTO IL into machine code. For GCC 9+, you need to feed it -flinker-output=nolto-rel.
@jpakkane
We try to avoid using the linker directly
I don't think that's needed. Also, if you want the linker plugin to take care of LTO it's much more practical to call it via a compiler driver. GCC's and Clang's compiler drivers support the -r option, and in the simplest case no options need to be passed to the linker directly. However, if, for example, you want to have static libraries as inputs to the -r step, most probably you'll need to pass some linker-specific options such as -Wl,--whole-archive.
could not get it to work using plain GCC
Please check two following scripts.
This one demonstrates pure gcc -r: https://pastebin.com/U3A21Lvy
And this one adds LTO to it: https://pastebin.com/n9EnqKkK (verified with GCC 9, for GCC 8 you'd most probably need to remove the -flinker-output option or set it to rel)
Bringing together all the threads it would seem that something like the following should do what people expect it to:
we add a new kwarg prelink to static_library which can be either true or false
prelink_args for specifying extra arguments?
Does this sound reasonable? Are there use cases which are not covered by this?
That's true, but do you really want to release fatLTO objects to 3rd-party customers?
No, but now I think this entire discussion about intermediate representations and fat objects was just a distraction that obscures your use case. Please correct me but I think your use case is much more simply this: you want LTO but you cannot ask your customer to use the exact same toolchain as you. So partial linking is a great opportunity for you to perform LTO earlier, on your site with your toolchain: the exact same toolchain that compiled your sources. End of use case, everything else is unimportant implementation details. Of course this means LTO is performed only across your own objects and not considering your customer's objects but that's much better than hit-and-miss LTO because of random toolchain incompatibilities.
we add a new kwarg prelink to static_library which can be either true or false
Can you confirm the above + extract_all_objects() would let you chain partial linking as many times as desired?
This was not explicit in any of the use cases but it wasn't excluded either and I think it's a good sanity check to verify that meson would not stop you from doing something that you can very simply do from the command line.
Can you confirm the above + extract_all_objects() would let you chain partial linking as many times as desired?
That should work in some way. There's the question of whether extract objects should provide the original un-prelinked objects or the prelinked one.
@marc-h38 Yes, correct. That was already described in my first message in this discussion, and I'm sorry if that wasn't clear enough ;-)
Can you confirm the above + extract_all_objects() would let you chain partial linking as many times as desired?
Shouldn't simply making a prelinked library a dependency of another prelinked library already result in chaining? (By chaining I mean gcc -r invoked on the result of another gcc -r)
@jpakkane That sounds good to me. Also, some care needs to be taken w.r.t. dependency propagation: the dependents of prelinked static libraries should not be linked with prelinked library's own dependencies (I assume that's how it works when e.g. executable is linked to a normal static library which depends on another static library... at least in CMake it's like that).
Shouldn't simply making prelinked library a dependency of another prelinked library already result in chaining? [...]
That would make sense. Sorry I forgot you may want static lib->static lib dependencies for compilation and header files. Cause these don't make sense at link time:
the dependents of prelinked static libraries should not be linked with prelinked library's own dependencies
Do you mean: while the tree of static dependencies is "flattened" at link time for regular static libraries, it should not be flattened for pre-linked static "libraries"? Good catch!
The more differences I keep seeing between static, shared and pre-link "libraries", the more I find that the concept of a common "library" super class is broken. It feels like having a super class for cars and houses because they both have doors.
There is no library superclass as such. library() is just shorthand for building either of the two (or both at the same time).
A preliminary version is now in #7983, please add your comments there.
I'm closing this PR in favor of the new one. I'd like to ask everyone who cares about the issue to check it out and report if it has any problems or if there are use cases it is not covering. Once merged it becomes harder and harder to change.
In case you didn't notice (there are only 3 people total subscribed to it), the implementation in #7983 was just merged... without any comment whatsoever from the people who are actually using this compiler feature and were very active on this page. Not even a thumbs up, strange! I feel like it's close enough to what their use cases require but only they can tell for sure.
Added a new builtin-option called 'sharedlib-linkmodel' for being able to generate a partial linked object using a shared_library() target.