Open marxin opened 2 years ago
Hm, looks like our symbol resolution is going wild. Checking.
The use of common symbols seems to be the culprit: Right now in mold, COMMON symbols are resolved as if they are undefined. Hence it triggers the needed heuristic.
I'm not entirely sure this is worth tackling, since the behavior of COMMON is rather underspecified and this is unlikely to cause problems in practice, although I do agree that extracting an SO doesn't make much sense here.
There's also another, unrelated problem: copyrel symbols have proper values in .dynsym but not in .symtab, because of when their contents are calculated. This is what confused me in the beginning, but it can be fixed separately.
It looks like the origin here is:
When a symbol is COMMON and ld sees an archive, ld checks whether the archive index provides a STB_GLOBAL definition of the symbol. If yes, ld extracts the archive as well. This is in contrary to the usual rule that only an undefined symbol leads to archive member extraction.
https://maskray.me/blog/2022-02-06-all-about-common-symbols#linker-behavior
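To make the quoted rule concrete, here is a minimal sketch in C of the two extraction decisions it contrasts. This is only a model of the description above, not code from ld or mold: the `sym_state` enum and both function names are hypothetical, and `index_has_global_def` stands for "the archive index provides a STB_GLOBAL definition of the symbol".

```c
#include <stdbool.h>

/* Hypothetical model of how the linker currently sees a symbol. */
enum sym_state { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

/* Usual rule: only an undefined symbol leads to archive member extraction. */
bool usual_rule_extracts(enum sym_state state, bool index_has_global_def) {
    return index_has_global_def && state == SYM_UNDEFINED;
}

/* Rule described in the quote: a COMMON symbol also leads to extraction
 * when the archive index provides a STB_GLOBAL definition for it. */
bool common_aware_rule_extracts(enum sym_state state, bool index_has_global_def) {
    return index_has_global_def &&
           (state == SYM_UNDEFINED || state == SYM_COMMON);
}
```

Under this model the two rules differ only for a COMMON symbol whose definition is available in the archive: `usual_rule_extracts(SYM_COMMON, true)` is false, while `common_aware_rule_extracts(SYM_COMMON, true)` is true. An already defined symbol never triggers extraction under either rule.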
And mold by design treats lazy archives and as-needed SOs the same, which explains the behavior you're seeing. So the rules here are really an accumulation of legacy behavior, and since modern toolchains no longer emit COMMON symbols unless asked to, the only reason to care about this is when some legacy application relies on it. Otherwise, I'm inclined to say that this is just implementation-defined behavior.
Reduced from binutils test-suite:
a.c:
b.c:
The latter one is BFD.