Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

error: -shared and -pie may not be used together #38782

Closed Quuxplusone closed 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR39810
Status RESOLVED FIXED
Importance P enhancement
Reported by Nick Desaulniers (ndesaulniers@google.com)
Reported on 2018-11-27 11:08:31 -0800
Last modified on 2018-12-10 10:06:30 -0800
Version unspecified
Hardware PC Linux
CC grimar@accesssoftek.com, llozano@chromium.org, llvm-bugs@lists.llvm.org, ruiu@google.com, smithp352@googlemail.com, srhines@google.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

The Linux kernel for arm64 uses -pie -shared for LDFLAGS when using KASLR (kernel address space layout randomization). Trying to link an arm64 Linux kernel with CONFIG_RANDOMIZE_BASE=y or CONFIG_RELOCATABLE=y produces the following error from lld:

ld.lld: error: -shared and -pie may not be used together

Quuxplusone commented 5 years ago

-pie is short for -pic-executable, and with -shared the linker does not create an executable but a shared object file. That's why they conflict.

I believe you can simply remove -pie.

-pie creates a position-independent executable. By default, shared objects are position-independent. So, it is likely that -pie is just superfluous.

Quuxplusone commented 5 years ago

With -pie removed but -shared kept, I observe many errors in the form:

ld.lld: error: can't create dynamic relocation R_AARCH64_ABS64 against symbol: in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output

Testing with -z notext -shared: Gets further along, but other errors prevent a complete link (unsure if they're related or not). Ex:

ld.lld: error: relocation cannot refer to absolute symbol:

for relocation types:

I have not tested what the the s/-pie/-z notext/ change does for bfd.

Testing with -pie kept but -shared removed: Creates the same error above: ld.lld: error: can't create dynamic relocation R_AARCH64_ABS64 against symbol: in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output

Quuxplusone commented 5 years ago

Kernel commit b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for CONFIG_RELOCATABLE=y") provides more context as to the addition of -shared. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/Makefile?id=b9dce7f1ba01be340975c17bd37a46ec6054bd2b

Kernel commit 1e48ef7fcc37 ("arm64: add support for building vmlinux as a relocatable PIE binary") introduced -pie. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/Makefile?id=1e48ef7fcc374051730381a2a05da77eb4eafdb0

Kernel commit 08cc55b2afd9 ("arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinux") added -Bsymbolic (looks related and is used with -pie -shared -Bsymbolic currently. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/Makefile?id=08cc55b2afd97a654f71b3bebf8bb0ec89fdc498

Quuxplusone commented 5 years ago
Better links:

-fpie:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e48ef7fcc374051730381a2a05da77eb4eafdb0

-Bsymbolic:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08cc55b2afd97a654f71b3bebf8bb0ec89fdc498

-shared:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b9dce7f1ba01be340975c17bd37a46ec6054bd2b
Quuxplusone commented 5 years ago

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b9dce7f1ba01be340975c17bd37a46ec6054bd2b

This change says that a debugger cannot handle relocated executables, but if that's true, isn't that a bug in the debugger? Position-independent executables do exist, and they should be ET_EXEC because they are executable file.

Quuxplusone commented 5 years ago

I think that at this stage it is reasonable for LLD to give an error message for -pie and -shared at the same time as it isn't clear what that combination means; link as -pie but set the e_type to ET_DYN, or link as -shared but set e_type to ET_EXEC? The former seems like it could produce shared objects that might not work in some cases due to assumptions made by -pie like using executable only TLS sequences. The latter should work but would result in a less efficient binary.

It is true that -bsymbolic -shared should generate something that looks very similar to what -fpie would generate in terms of PLT and GOT generation. I think that there are subtle differences but they probably wouldn't affect the linux kernel.

I think that it is likely that tool support in linkers and debuggers for PIE has improved over time and I suspect that modern versions of GDB can handle ET_EXEC PIE executables without problems although I'll need to double check that.

What is perhaps more concerning are the relocation error messages with -pie. In theory if all inputs are compiled -fpie this shouldn't happen. There is usually some bit of (possibly inline) assembly that causes the error. LLD and ld.bfd have different defaults for -zdefs so it may be that this isn't reported by ld.bfd.

I will try and reproduce the problem on an AArch64 machine and will report back. I'll also check what -pie and -shared mean in ld.bfd and whether it is something that LLD ought to emulate.

Quuxplusone commented 5 years ago
Apologies in my previous comment when I said the default value of -zdefs was
different, I meant the default for -ztext is different. For lld it is -ztext
and for ld.bfd I believe it is -znotext.

I've been able to reproduce the error and I am looking at it now. With -znotext
and removing --shared I get a smaller number of errors like:
ld.lld: error: relocation R_AARCH64_ADR_PREL_PG_HI21 cannot refer to absolute
symbol: __efistub__text
>>> defined in <internal>
>>> referenced by arch/arm64/kernel/efi-entry.stub.o:(__efistub_entry) in
archive built-in.a

This is essentially saying no relative relocations to an absolute symbol. In
general this is the correct thing to do as when the place of the relocation
changes due to the executable being loaded at a different address the offset
will change. I'll need to track down where the reference and definition are
coming from. The error message implies it is linker created but I've not found
out from where yet.

I also see some errors about not allowing the dynamic section and a few other
to be discarded. I can't find a good reason for this. The source code commit
says:
"r305613 | rafael | 2017-06-17 00:45:35 +0100 (Sat, 17 Jun 2017) | 3 lines

Error on trying to discard .dynamic.

We would crash instead before."

It may be that support for this hasn't been implemented as it hasn't been
needed yet.
Quuxplusone commented 5 years ago
(In reply to Peter Smith from comment #7)
> I'll need to track down where the reference and definition are
> coming from. The error message implies it is linker created but I've not
> found out from where yet.

My bet is it is coming from the linker script.

> It may be that support for this hasn't been implemented as it hasn't been
needed > yet.

As far I remember we were crashing when some special sections were discarded.
IT was easier to ban the discarding sometimes and I think that is why we did it
with the several sections initially.
Quuxplusone commented 5 years ago
Yes the symbols are coming from the linkerscript. I got confused as I was
looking for a symbol __efistub_entry which I couldn't find, but I should have
been looking for __efistub_text.

The relevant part of the linkerscript is:

__efistub_stext_offset = stext - _text;
__efistub_memcmp = ABSOLUTE(__pi_memcmp);
__efistub_memchr = ABSOLUTE(__pi_memchr);
__efistub_memcpy = ABSOLUTE(__pi_memcpy);
__efistub_memmove = ABSOLUTE(__pi_memmove);
__efistub_memset = ABSOLUTE(__pi_memset);
__efistub_strlen = ABSOLUTE(__pi_strlen);
__efistub_strnlen = ABSOLUTE(__pi_strnlen);
__efistub_strcmp = ABSOLUTE(__pi_strcmp);
__efistub_strncmp = ABSOLUTE(__pi_strncmp);
__efistub_strrchr = ABSOLUTE(__pi_strrchr);
__efistub___flush_dcache_area = ABSOLUTE(__pi___flush_dcache_area);
__efistub__text = ABSOLUTE(_text);
__efistub__end = ABSOLUTE(_end);
__efistub__edata = ABSOLUTE(_edata);

I'll need to check ld.bfd behaviour with respect to relative relocations to
absolute symbols.
Quuxplusone commented 5 years ago
I've checked that with ld.bfd -ztext will fail the link with
ld: read-only segment has dynamic relocations.

It doesn't matter if -shared is used with or without -fpie here.

To summarise where we are:
- Assuming that the read-only segment dynamic relocations are intentional it
looks like -znotext will be needed to link with lld.
- The error for relative relocations to absolute linker defined symbols needs
some more investigation about whether ld.bfd just doesn't have this error
message at all, or if it special cases linker defined symbols.
- The error for discarding the linker created sections needs some more
investigation to see if it can be made safe enough to remove the error message.
Quuxplusone commented 5 years ago
I've checked over ld.bfd behaviour and the Linux kernel to try and explain the
remaining differences.

First, ld.bfd has no error checking at all for relative relocations to absolute
symbols. In this particular case it doesn't matter as the symbols are section
relative but it would be possible to write an example that would fail, most
likely seen in a bare metal embedded context.

The remaining absolute symbols are ostensibly coming from image.h
__efistub_stext_offset = stext - _text;

/*
 * Prevent the symbol aliases below from being emitted into the kallsyms
 * table, by forcing them to be absolute symbols (which are conveniently
 * ignored by scripts/kallsyms) rather than section relative symbols.
 * The distinction is only relevant for partial linking, and only for symbols
 * that are defined within a section declaration (which is not the case for
 * the definitions below) so the resulting values will be identical.
 */
#define KALLSYMS_HIDE(sym)      ABSOLUTE(sym)

/*
 * The EFI stub has its own symbol namespace prefixed by __efistub_, to
 * isolate it from the kernel proper. The following symbols are legally
 * accessed by the stub, so provide some aliases to make them accessible.
 * Only include data symbols here, or text symbols of functions that are
 * guaranteed to be safe when executed at another offset than they were
 * linked at. The routines below are all implemented in assembler in a
 * position independent manner
 */
__efistub_memcmp                = KALLSYMS_HIDE(__pi_memcmp);
...
__efistub__text                 = KALLSYMS_HIDE(_text);
__efistub__end                  = KALLSYMS_HIDE(_end);
__efistub__edata                = KALLSYMS_HIDE(_edata);

Curiously if I remove the ABSOLUTE then I still get the error for relocations
to __efistub__text, __efistub__end and __efistub__edata which are defined in
the linker script

 .head.text : {
  _text = .;
  KEEP(*(.head.text))
 }

It seems that if the linker defined _text, which is section relative (not
absolute) is referred to directly with a relative relocation all is well, but
if __efistub__text is refered to with a relative relocation we get an error
message. It seems like __efistub__text                 = _text; loses the
section relative property of _text.

To summarise:
1.) The bulk of the ABSOLUTE symbols are coming from image.h. This doesn't make
a difference with ld.bfd as it has no error checking.
2.) LLD can lose the section relative property of linker defined symbols when
assigning.

I think 2. is a bug in LLD which we should fix. I'm not entirely sure what to
do about 1. Strictly speaking ABSOLUTE should not be used here as it has lost
information that the linker needs to know to safely use a relative relocation.
I don't know what alternative the kernel can use to satisfy the comment about
kallsyms though. It could be possible to add an option to lld to not error on
relative relocations to linker generated absolute symbols but this seems like a
hack for compatibility with ld.bfd.
Quuxplusone commented 5 years ago
(In reply to Peter Smith from comment #7)
> I also see some errors about not allowing the dynamic section and a few
> other to be discarded. I can't find a good reason for this. The source code
> commit says:
> "r305613 | rafael | 2017-06-17 00:45:35 +0100 (Sat, 17 Jun 2017) | 3 lines
>
> Error on trying to discard .dynamic.
>
> We would crash instead before."
>
> It may be that support for this hasn't been implemented as it hasn't been
> needed yet.

I also see these, for a few sections listed.  It seems these are coming from a
DISCARD section in a linker script (arch/arm64/kernel/vmlinux.lds.S):

        /DISCARD/ : {
                ARM_EXIT_DISCARD(EXIT_TEXT)
                ARM_EXIT_DISCARD(EXIT_DATA)
                EXIT_CALL
                *(.discard)
                *(.discard.*)
                *(.interp .dynamic)
                *(.dynsym .dynstr .hash)
        }

LLD complains about discarding .dynamic, .dynsym, and .dynstr.  Maybe those are
required when compiling as -shared?  Should I file a separate bug about those?
Quuxplusone commented 5 years ago
(In reply to Nick Desaulniers from comment #12)
> (In reply to Peter Smith from comment #7)
> > I also see some errors about not allowing the dynamic section and a few
> > other to be discarded. I can't find a good reason for this. The source code
> > commit says:
> > "r305613 | rafael | 2017-06-17 00:45:35 +0100 (Sat, 17 Jun 2017) | 3 lines
> >
> > Error on trying to discard .dynamic.
> >
> > We would crash instead before."
> >
> > It may be that support for this hasn't been implemented as it hasn't been
> > needed yet.
>
> I also see these, for a few sections listed.  It seems these are coming from
> a DISCARD section in a linker script (arch/arm64/kernel/vmlinux.lds.S):
>
>         /DISCARD/ : {
>                 ARM_EXIT_DISCARD(EXIT_TEXT)
>                 ARM_EXIT_DISCARD(EXIT_DATA)
>                 EXIT_CALL
>                 *(.discard)
>                 *(.discard.*)
>                 *(.interp .dynamic)
>                 *(.dynsym .dynstr .hash)
>         }
>
> LLD complains about discarding .dynamic, .dynsym, and .dynstr.  Maybe those
> are required when compiling as -shared?  Should I file a separate bug about
> those?

For a non-statically linked application or shared library that will be loaded
by something like ld.so I think the .dynamic section is be needed as that tells
it where to find the relocations, symbols etc. The symbol table is not strictly
essential (not all relocations need symbols) but almost all will.

In the case of the kernel, which will do its own loading, and doesn't need to
find any symbols, it looks like it is encoding the information it needs to find
the relocations in the linker script, probably:
__rela_offset = ABSOLUTE(ADDR(.rela) - ((((((0xffffffffffffffff)) - (((1)) << \
(39)) + 1) + (0)) + (0x08000000))));
__rela_size = SIZEOF(.rela);

So I think it is legitimate to discard those sections for the kernel and
embedded systems that can load themselves.

Overall I think there are three issues:
1.) Ability to discard more of the linker created sections.
2.) A symbol assignment to symbol defined in a section description in a linker
script can lose the section information.
3.) LLD is giving an error message for relative relocations to absolute
locations and ld.bfd doesn't and the kernel is relying on that at the moment
for kallsyms.

I think 1.) and 2.) are definite LLD problems and could certainly go in
separate PRs as this one is getting a bit too long. I'm not sure about 3.) it
can be worked around with a change in LLD but there may be a better solution in
the kernel.

I think -pie and -shared together is most likely not a problem and -pie should
be used on its own instead. I've not got round to checking gdb yet.
Quuxplusone commented 5 years ago
As to __efistub_*, I wonder if there is a way to write code that does the same
thing without linker script or with a linker script that doesn't exercise
corner cases like that.
Quuxplusone commented 5 years ago
I've taken a look at the ET_DYN and ET_EXEC issue. The situation is a bit
messier than I'd like. One thing I'd assumed from the context was the -pie
executables always have e_type ET_EXEC but this is not the case, they usually
have type ET_DYN.

gdb seems to only relocates the executable's symbols and debug information when
the e_type is ET_DYN and not when the type is ET_EXEC.

LLD when -pie or -shared is used always sets the ELF e_type to ET_DYN.

ld.bfd when -pie is used sets the ELF e_type to ET_DYN unless the address of
the text segment is non 0. This appears to be due to:
https://sourceware.org/ml/binutils/2013-12/msg00081.html it seems like the
theory is that if you set a non 0 text segment starting address you didn't want
the kernel to randomize the address anyway. It seems like it was some way of
getting the small code model to execute at a fixed address on x86.

In ld.bfd the linker's command line processing for -pie just sets the
link_info.type = type_pie. The -shared will set link_info.type = type_dll and
allow symbols to be undefined.

To summarise:
- In LLD it doesn't matter whether -pie or -shared is used the ELF file gets
ET_DYN.
- In ld.bfd the -shared is effectively overriding the -pie, and the -bsymbolic
is effectively making the linker behave as if it were -pie for the purposes of
relocation, PLT and GOT generation.

It seems like for ld.bfd to get ET_DYN you are going to need to keep using -
shared and -bsymbolic. This should behave the same way on LLD. I don't think
you need -pie as I think it is being overridden by -shared.
Quuxplusone commented 5 years ago
Re: __efistub_*, Ard has posted patches removing them:
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-November/616512.html

Speaking with Ard via email, he suggested that we can remove -pie and -shared
from the kernel, so the details about compatibility with binutils for this
special case are moot.

It sounds then the final issue is allowing the discarding of .dynsym, .dynamic,
and .dynstring sections in LLD, which I suppose is feasible?
Quuxplusone commented 5 years ago

I'm not sure if it is easy to discard .dynsym, .dynamic, and .dynstring safely, because these sections get a special treatment in the linker, but theoretically, yes, they should be discardable.

Quuxplusone commented 5 years ago
(In reply to Peter Smith from comment #11)
> I've checked over ld.bfd behaviour and the Linux kernel to try and explain
> the remaining differences.

> Curiously if I remove the ABSOLUTE then I still get the error for
> relocations to __efistub__text, __efistub__end and __efistub__edata which
> are defined in the linker script

Ah, I found this thread from grimar@ from last year:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170116/420892.html
Seems like this issue has come up before.
Quuxplusone commented 5 years ago
Ard has authored 3 patches to resolve this issue.

http://lists.infradead.org/pipermail/linux-arm-kernel/2018-
November/616512.html: [PATCH] arm64: drop linker script hack to hide efistub
symbols
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-
December/616754.html: [RFT PATCH] arm64: relocatable: build the kernel as a
proper shared library
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-
December/616765.html: [RFT PATCH] arm64: add support for building the KASLR
kernel with LLVM lld

From my testing, I concur that this issue is resolved.  There may be more child
bugs to file from this exploration, but are no longer a blocker for us.  Thanks
everyone for your help.  Closing as invalid.  Will file new bugs should any of
the above patches not land for whatever reason.
Quuxplusone commented 5 years ago

Forked off: https://bugs.llvm.org/show_bug.cgi?id=39857

Quuxplusone commented 5 years ago
(In reply to Nick Desaulniers from comment #18)
> (In reply to Peter Smith from comment #11)
> > I've checked over ld.bfd behaviour and the Linux kernel to try and explain
> > the remaining differences.
>
> > Curiously if I remove the ABSOLUTE then I still get the error for
> > relocations to __efistub__text, __efistub__end and __efistub__edata which
> > are defined in the linker script
>
> Ah, I found this thread from grimar@ from last year:
> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170116/420892.html
> Seems like this issue has come up before.

Sorry, my memory is weak here, this happened almost 2 years ago. As far I could
find the issue we had that time was resolved with
https://reviews.llvm.org/D29332.
Quuxplusone commented 5 years ago
I've put a reproducer in https://bugs.llvm.org/show_bug.cgi?id=39857 the
problem is of the form:

alias = section_relative_symbol;

SECTIONS {
    .text : { section_relative_symbol = . ; *(.text) }
}

alias is coming out as absolute, presumably as it is seen before
section_relative_symbol (not checked).
Quuxplusone commented 5 years ago
(In reply to Rui Ueyama from comment #17)
> I'm not sure if it is easy to discard .dynsym, .dynamic, and .dynstring
> safely, because these sections get a special treatment in the linker, but
> theoretically, yes, they should be discardable.

Patch for discarding .dynamic is https://reviews.llvm.org/D55211
Quuxplusone commented 5 years ago
(In reply to George Rimar from comment #23)
> (In reply to Rui Ueyama from comment #17)
> > I'm not sure if it is easy to discard .dynsym, .dynamic, and .dynstring
> > safely, because these sections get a special treatment in the linker, but
> > theoretically, yes, they should be discardable.
>
> Patch for discarding .dynamic is https://reviews.llvm.org/D55211

For .dynsym:
https://reviews.llvm.org/D55218

For .dynstr:
https://reviews.llvm.org/D55215
Quuxplusone commented 5 years ago

Looks like we don't need to allow these sections to be discarded anymore? I also don't think it generally doesn't make sense to discard these sections.

Quuxplusone commented 5 years ago
There are 3 patches on LKML at the moment, not yet landed in mainline:

http://lists.infradead.org/pipermail/linux-arm-kernel/2018-November/616512.html:
[PATCH] arm64: drop linker script hack to hide efistub symbols
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-December/616754.html:
[RFT PATCH] arm64: relocatable: build the kernel as a proper shared
library
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-December/616765.html:
[RFT PATCH] arm64: add support for building the KASLR kernel with LLVM

The middle one removes the discard of the .dynamic, .dynstr and .dynsym. My
understanding is that it is still desirable to permit the removal of these
sections to reduce image size.

In an embedded context it makes sense to remove these sections to save on
overall code-size for position independent executables with all relative (no
symbol needed) relocations.
Quuxplusone commented 5 years ago
Reopening, due to:
1. http://lists.infradead.org/pipermail/linux-arm-kernel/2018-
December/617179.html
2. http://lists.infradead.org/pipermail/linux-arm-kernel/2018-
December/617176.html
Quuxplusone commented 5 years ago

Patches to support discarding of the .dynamic, .dynstr and .dynsym were landed in r348746, r348748 and r348749.

Quuxplusone commented 5 years ago

Thanks for the patches George, and everyone else for the reviews and feedback. Looks like the only standing issue left for us on the kernel side is https://bugs.llvm.org/show_bug.cgi?id=39857, so I'll close out this bug and we can follow up there. Thanks again everyone for the lightning fast turnaround time on these bugs!