seL4 / sel4runtime

A minimal runtime for seL4 applications.
Other
12 stars 29 forks source link

Do not define __sysinfo twice #3

Closed sorear closed 4 years ago

sorear commented 4 years ago

It is already defined in musllibc and defining it here results in multiple definition errors with clang 11. Will need to be revisited when musllibc is removed.

xurtis commented 4 years ago

sel4runtime is intended not to depend on external libc as much as possible and to be explicitly incompatible with external runtimes. When using sel4runtime the definition of __sysinfo from musllibc shouldn't be included.

sorear commented 4 years ago

So you're proposing to drop this and patch musllibc instead?

xurtis commented 4 years ago

I'm wondering why you're observing this issue at this point, it sounds like components of musllibc are being linked into images where they shouldn't be.

sorear commented 4 years ago

sel4test uses at least the following symbols which pull in libc.o (and __sysinfo) directly or indirectly: printf, puts, malloc, free, calloc, vsnprintf, realloc, stdout, fwrite, stderr, fflush, snprintf

xurtis commented 4 years ago

So it does seem that a large portion of the object files in musllibc include "libc.h" which adds the extern declaration for __sysinfo. I can see why this would create an issue but am now confused as to why this is only now producing errors.

In any case, maybe it would be better to add an extern declaration. Will need to check that this doesn't interfere with our other projects (if this were to cause an issue with anything else I would think that it would be rumprun).

sorear commented 4 years ago

I can't quite interpret that. Object files don't include header files, they have defined and undefined symbols. Large parts of the sel4 library ecosystem have "malloc" as an undefined symbol; this is satisfied by pulling in malloc.o; malloc.o references libc (from "libc.threads_minus_1" in the source), and the libc symbol is satisfied by the libc.o object, which also includes __sysinfo.

I'm not clear on which combinations of sel4runtime without musllibc, and musllibc without sel4runtime, need to be supported here. We can make musl not define __sysinfo in libc.o, but then projects that don't include sel4runtime will have a problem.

xurtis commented 4 years ago

Sorry, my comment wasn't particularly clear. A large number of the .c files that are compiled into those object files include libc.h which appears to have the external definition of __sysinfo along with other symbols provided by libc.o. This should already be causing a conflict between the symbol in sel4runtime and musllibc, so I'm surprised that it's only in the latest version of clang that this occurs.

It seems that GCC 8 (and perhaps other versions of GCC and older versions of clang) omit external declarations if the code in an object file doesn't depend on them. This would mean that the object files that were being used didn't depend on __sysinfo or depended on __sysinfo but not any of the other symbols from libc.o, allowing libc.o to be skipped when sel4runtime had already provided the __sysinfo symbol.

I'm happy to approve this change if it passes tests, all of the other projects we maintain all still use musl at this point.

xurtis commented 4 years ago

@ssrg-bamboo test

ssrg-bamboo commented 4 years ago

Hello, I'm a bot! I'll bring this PR into Trustworthy Systems and run some tests

ssrg-bamboo commented 4 years ago

All the tests we ran have passed! Nice job!

jrtc27 commented 4 years ago

Sorry, my comment wasn't particularly clear. A large number of the .c files that are compiled into those object files include libc.h which appears to have the external definition of __sysinfo along with other symbols provided by libc.o. This should already be causing a conflict between the symbol in sel4runtime and musllibc, so I'm surprised that it's only in the latest version of clang that this occurs.

Having the extern definition (strictly declaration not definition as the two have different meanings in this context) poses no issue. It does nothing other than instruct the compiler that the symbol can be assumed to exist somewhere at link/run time if needed. The conflict is because there were two conflicting definitions at link time (ie non-extern non-static global variables). Clang 10 and below, and GCC 9 and below, default to -fcommon, which means that zero-initialised globals can be duplicated and will be merged at link time, but this can mask bugs and so Clang 11 and GCC 10 have changed their default to -fno-common, uncovering dodgy code across the entire open-source world.

It seems that GCC 8 (and perhaps other versions of GCC and older versions of clang) omit external declarations if the code in an object file doesn't depend on them.

That's to be expected; extern merely makes the symbol available, but if nothing's using it then it never gets pulled in, otherwise everything would be a mess. Just think of how many symbols are declared in all the headers a typical C file transitively includes, let alone C++.

This would mean that the object files that were being used didn't depend on __sysinfo or depended on __sysinfo but not any of the other symbols from libc.o, allowing libc.o to be skipped when sel4runtime had already provided the __sysinfo symbol.

No, it implies nothing about that, as -fcommon hid the bug. It could have linked in libc.o and merged the two, or it could have happened to not need libc.o in certain cases. You just don't know either way.

I'm happy to approve this change if it passes tests, all of the other projects we maintain all still use musl at this point.

xurtis commented 4 years ago

I was struggling to find documentation regarding how the linker handled this case. This would explain some of linker issues that have popped up elsewhere as well. At this juncture, I'm still happy with the change, particularly if it means that we no longer rely on the merging particular behaviour.