llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[LLD] Adding a modern target-selection flag to the drivers #97124

Closed Ericson2314 closed 4 months ago

Ericson2314 commented 5 months ago

LLD's frontends are currently very faithful to the linkers they are based on. But that means the target-selection mechanisms they have are rather underpowered. I think it would be good to have modern flags that would allow us to set the EM_* choice, the ELFOSABI_* choice, and the ELFKind independently (for valid combinations).

A good use for this would be better handling of the "OSABI" field. For example:

Problems

FreeBSD existing hacks

https://github.com/llvm/llvm-project/blob/9572388a849c494d45df334f9facd8ee6663953f/lld/ELF/Driver.cpp#L174-L177 is an ad-hoc hack for FreeBSD. The corresponding code in Clang to use it is even uglier:

https://github.com/llvm/llvm-project/blob/9572388a849c494d45df334f9facd8ee6663953f/clang/lib/Driver/ToolChains/FreeBSD.cpp#L174-L218

If Clang could use "regular" code to transform the CPU into a -m flag, and then separately tell LLD that the ELFOSABI_* is ELFOSABI_FREEBSD, that would be much cleaner.
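To make the shape of the hack concrete, here is a minimal, self-contained sketch of suffix-based OSABI selection. It is loosely modeled on the idea behind the linked Driver.cpp lines and is not the actual LLD source: the `Emulation` struct and the three-entry emulation table are illustrative assumptions, and only the numeric ELF constants come from the ELF specification.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// ELF constants (numeric values as defined by the ELF specification).
constexpr std::uint8_t ELFOSABI_NONE = 0;
constexpr std::uint8_t ELFOSABI_FREEBSD = 9;
constexpr std::uint16_t EM_386 = 3;
constexpr std::uint16_t EM_X86_64 = 62;
constexpr std::uint16_t EM_AARCH64 = 183;

enum class ELFKind { ELF32LE, ELF64LE };

// Illustrative result type: the three things an -m value ends up selecting.
struct Emulation {
  ELFKind kind;
  std::uint16_t machine;
  std::uint8_t osabi;
};

// Sketch of parsing an -m value such as "elf_x86_64_fbsd".
std::optional<Emulation> parseEmulation(std::string_view name) {
  std::uint8_t osabi = ELFOSABI_NONE;
  constexpr std::string_view suffix = "_fbsd";
  // The ad-hoc part: the OS is smuggled in as a suffix of the CPU emulation.
  if (name.size() > suffix.size() &&
      name.substr(name.size() - suffix.size()) == suffix) {
    name.remove_suffix(suffix.size());
    osabi = ELFOSABI_FREEBSD;
  }
  // Illustrative subset of emulation names; a real linker knows many more.
  if (name == "elf_i386")
    return Emulation{ELFKind::ELF32LE, EM_386, osabi};
  if (name == "elf_x86_64")
    return Emulation{ELFKind::ELF64LE, EM_X86_64, osabi};
  if (name == "aarch64elf")
    return Emulation{ELFKind::ELF64LE, EM_AARCH64, osabi};
  return std::nullopt; // unknown emulation
}
```

The point of the sketch is that the OS choice only exists as a string convention on the CPU name, so every additional OS would need its own suffix and its own Clang-side code to produce it.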

OpenBSD has similar needs

As discussed in https://github.com/llvm/llvm-project/pull/92675, we ought to have CI for OpenBSD. However, OpenBSD has some outstanding downstream changes that need to be upstreamed before binaries produced by upstream tools will work, and many of those changes currently assume OpenBSD->OpenBSD native compilation, so they are unfit to upstream as-is.

https://github.com/llvm/llvm-project/pull/97122 is the first such patch I've rebased. This is a somewhat borderline case, as there is already blanket handling of .openbsd.random regardless of the ELFOSABI_* in use. Still, "stealing names" from all ELF usages of LLD doesn't seem very elegant, even if the .openbsd.random case is grandfathered in. I would much rather start requiring ELFOSABI_OPENBSD, with the .openbsd.random case deprecated with a warning. That said, if doing this would require an _obsd hack like FreeBSD's _fbsd, I can't help but think the medicine is almost as bad as the disease.

Solutions

Proposal A: --target flag

Add a flag for "regular" LLVM triples --- like Clang's --target or LLVM's -mtriple --- so we have more expressive power. Those triples should be mostly possible to map to the choices above, and -m flags could still be used to fill in the gaps.

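Advantages:

"No new syntax".

Tools like Clang can just forward their --target argument as-is and hope for the best.

Disadvantages:

LLVM triples can say both too much and too little.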
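As a rough illustration of the mapping Proposal A relies on, the following sketch derives an OSABI value from the OS component of a triple. The function `osabiFromTriple` and the helper `startsWith` are hypothetical names, not LLVM APIs, and the sketch assumes the triple is already in the full `arch-vendor-os[-env]` form (LLVM's own triple handling normalizes shorter spellings; this sketch does not).

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// ELF constants (numeric values as defined by the ELF specification).
constexpr std::uint8_t ELFOSABI_NONE = 0;
constexpr std::uint8_t ELFOSABI_FREEBSD = 9;
constexpr std::uint8_t ELFOSABI_OPENBSD = 12;

// Hypothetical helper: true if `s` begins with `prefix`.
bool startsWith(std::string_view s, std::string_view prefix) {
  return s.substr(0, prefix.size()) == prefix;
}

// Hypothetical mapping from an "arch-vendor-os[-env]" triple to EI_OSABI.
std::uint8_t osabiFromTriple(std::string_view triple) {
  // Skip the arch and vendor components.
  for (int i = 0; i < 2; ++i) {
    std::size_t dash = triple.find('-');
    if (dash == std::string_view::npos)
      return ELFOSABI_NONE; // the triple says too little
    triple.remove_prefix(dash + 1);
  }
  // Whatever precedes the next dash (or the end) is the OS component.
  std::string_view os = triple.substr(0, triple.find('-'));
  if (startsWith(os, "freebsd")) // may carry a version, e.g. "freebsd14"
    return ELFOSABI_FREEBSD;
  if (startsWith(os, "openbsd"))
    return ELFOSABI_OPENBSD;
  return ELFOSABI_NONE; // every other OS is left untagged in this sketch
}
```

Even this small sketch shows both sides of the disadvantage: the version suffix and the environment component are information the linker has to discard, while a triple with a missing component gives it nothing to work with.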

Proposal B: New greenfield flags

Add multiple new flags for specifying these parts independently. Certainly ELFOSABI_* needs one. -m handles the EM_* and ELFKind residual alright, perhaps, or perhaps they get fresh new flags too.

Advantages: No syntax vs semantics mismatch / friction / corner cases.

Disadvantage: Greenfield new flags for other tooling to have to learn about.
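A sketch of what Proposal B implies on the linker side: each component is parsed on its own, and the combination is validated afterwards. Everything named here is hypothetical, including the OSABI spellings accepted by `parseOsabiName`, the `TargetSelection` struct, and `isValidCombination`; no such flag exists in LLD today, and the validity table covers only two machines for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// ELF constants (numeric values as defined by the ELF specification).
constexpr std::uint8_t ELFOSABI_NONE = 0;
constexpr std::uint8_t ELFOSABI_FREEBSD = 9;
constexpr std::uint8_t ELFOSABI_OPENBSD = 12;
constexpr std::uint16_t EM_386 = 3;
constexpr std::uint16_t EM_X86_64 = 62;

enum class ELFKind { ELF32LE, ELF64LE, ELF64BE };

// Hypothetical: the three independently chosen components.
struct TargetSelection {
  ELFKind kind;
  std::uint16_t machine;
  std::uint8_t osabi;
};

// Hypothetical value parser for a dedicated OSABI flag.
std::optional<std::uint8_t> parseOsabiName(std::string_view name) {
  if (name == "none")
    return ELFOSABI_NONE;
  if (name == "freebsd")
    return ELFOSABI_FREEBSD;
  if (name == "openbsd")
    return ELFOSABI_OPENBSD;
  return std::nullopt; // unknown spelling: report an error to the user
}

// Reject combinations that cannot describe a real target.
bool isValidCombination(const TargetSelection &t) {
  switch (t.machine) {
  case EM_386: // 32-bit little-endian only
    return t.kind == ELFKind::ELF32LE;
  case EM_X86_64: // 64-bit, or 32-bit for the x32 ABI; never big-endian
    return t.kind == ELFKind::ELF64LE || t.kind == ELFKind::ELF32LE;
  default:
    return false; // machines outside this illustrative table
  }
}
```

Because the OSABI is validated separately from the machine and the ELFKind, adding a new OS would not touch the CPU table at all, which is the independence the proposal is after.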


CC @brad0 because OpenBSD

CC @mstorsjo because I am curious if Windows stuff has similar needs / not sure what to do about lld-link (clang-cl takes --target and many other GNU-style flags, but lld-link only takes /flag MS-style flags)

llvmbot commented 5 months ago

@llvm/issue-subscribers-lld-elf

Author: John Ericson (Ericson2314)

MaskRay commented 5 months ago

We can do something more lightweight. Does #97144 work for you? OpenBSD can make all relocatable files tagged with ELFOSABI_OPENBSD. It could also ensure that Scrt1.o/crtbeginS.o are tagged.
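For readers unfamiliar with how a relocatable file is "tagged": the OSABI lives in a single byte of the ELF identification array. The sketch below reads that byte and then picks an output OSABI from a list of inputs. The selection rule in `inferOsabi` (first value that is not ELFOSABI_NONE wins) is an assumption made for illustration, not a description of what #97144 implements; both function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Layout constants from the ELF specification.
constexpr std::size_t EI_OSABI = 7;   // index of the OSABI byte in e_ident
constexpr std::size_t EI_NIDENT = 16; // size of e_ident
constexpr std::uint8_t ELFOSABI_NONE = 0;
constexpr std::uint8_t ELFOSABI_OPENBSD = 12;

// Return the OSABI byte of an ELF header, or nullopt if the magic is wrong.
std::optional<std::uint8_t>
readOsabi(const std::array<std::uint8_t, EI_NIDENT> &ident) {
  if (ident[0] != 0x7f || ident[1] != 'E' || ident[2] != 'L' ||
      ident[3] != 'F')
    return std::nullopt;
  return ident[EI_OSABI];
}

// Assumed rule for this sketch: the first tagged input decides the output.
std::uint8_t inferOsabi(const std::vector<std::uint8_t> &inputOsabis) {
  for (std::uint8_t value : inputOsabis)
    if (value != ELFOSABI_NONE)
      return value;
  return ELFOSABI_NONE; // no input was tagged
}
```

Under such a rule, tagging the startup objects (Scrt1.o/crtbeginS.o) would be enough for the output to carry the OS's value even if other inputs are untagged, with no new linker flag involved.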

While OpenBSD has proposed many interesting ideas for security hardening, I am deeply concerned about its development practices and the rapid pace at which it introduces questionable extensions. Particularly on the object file format side, the frequent addition of new .openbsd.* sections and PT_OPENBSD_* program headers raises my eyebrows. In getOutputSectionName, the number of prefixes matters for performance.

I hope that there is special support to merge .openbsd.xxx.$unique into .openbsd.xxx. Just define .openbsd.xxx instead of .openbsd.xxx.$unique.
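The performance remark can be illustrated with a simplified prefix-merging routine. This is only a sketch of the general technique and not LLD's getOutputSectionName: the prefix list is a made-up four-entry example, `.openbsd.xxx` is the placeholder name from the comment above, and the helper names are assumptions.

```cpp
#include <array>
#include <string_view>

// Illustrative prefix list; a real linker's list is longer and differs.
constexpr std::array<std::string_view, 4> kPrefixes = {
    ".text", ".data", ".bss", ".openbsd.xxx"};

// True if `name` is exactly `prefix` or continues with ".<suffix>".
bool isSectionPrefix(std::string_view prefix, std::string_view name) {
  if (name.substr(0, prefix.size()) != prefix)
    return false;
  return name.size() == prefix.size() || name[prefix.size()] == '.';
}

// Map an input section name to the output section it is merged into.
std::string_view outputSectionName(std::string_view name) {
  // Linear scan: every extra prefix is one more comparison that each
  // input section without an earlier match has to pay for.
  for (std::string_view prefix : kPrefixes)
    if (isSectionPrefix(prefix, name))
      return prefix;
  return name; // no known prefix: keep the name unchanged
}
```

If the compiler emitted `.openbsd.xxx` directly instead of `.openbsd.xxx.$unique`, the last table entry would be unnecessary, because a section whose name needs no merging falls through to the final `return name`.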

mstorsjo commented 5 months ago

> because I am curious if Windows stuff has similar needs

Not in particular, I think. The main distinction, probably somewhat similar to the target OS of an ELF object, is whether it targets the mingw or the MSVC ABI. We already handle that in LLD by using two separate entry points (ld.lld with a Windows -m parameter, or lld-link), and the mingw entry point invokes lld-link with the parameter -lldmingw.

> not sure what to do about lld-link (clang-cl takes --target and many other GNU-style flags, but lld-link only takes /flag MS-style-flags)

I don't think it's needed, but you could invent an lld-link style spelling of it, e.g. /target:<triple>. If you had Clang pass it automatically, that should only be done when we know the linker is lld-link and not plain link.exe. In practice, though, when linking with lld-link you seldom use the compiler driver (clang) to invoke the linker; most often the user or the build system invokes lld-link directly. So in that case we wouldn't implicitly be getting the target triple for free anyway, unless we teach all build systems to pass it.

But as said above, I don't see a direct need for it at all.

Ericson2314 commented 5 months ago

> We can do something more lightweight. Does https://github.com/llvm/llvm-project/pull/97144 work for you?

Yes, I think it does! Assuming Clang already sets that or can easily be made to do so, which I think is true.

@mstorsjo Thanks for the feedback. It's nice that @MaskRay's solution sidesteps the need for a new flag. clang-cl already takes --target so I think we are good here!

Ericson2314 commented 4 months ago

Closing because #97144 does work for me :)