Closed llvmbot closed 7 years ago
The failure was introduced by r288107. I will follow up with details to the mailing list or a separate PR.
That's now PR 31221
Something is still not quite right and linking the FreeBSD bootloaders with that change does not work for me, although George reported success.
Yes, minimal scenario from this PR as well as building+linking complete source of usr/src/sys/boot/efi/boot1 loader is fixed for me with https://reviews.freebsd.org/D8681 applied.
Tested in a next way: root@freebsd:/usr/src/sys/boot/efi/boot1 make clean make objcopy \ -j .peheader -j .text -j .sdata -j .data \ -j .dynamic -j .dynsym -j .rel.dyn \ -j .rela.dyn -j .reloc -j .eh_frame \ --output-target=efi-app-x86_64 boot1.sym.full loader.efi
now loader.efi is workable.
Though I still observe issue with objcopy call from original makefile:
objcopy --only-keep-debug boot1.sym.full boot1.sym.debug
BFD: boot1.sym.debug: section .dynsym' can't be allocated in segment 4 objcopy: boot1.sym.debug: Bad value BFD: boot1.sym.debug: section
.dynsym' can't be allocated in segment 4
objcopy: boot1.sym.debug: Bad value
*** Error code 1
That issue can be fixed with passing -N to linker and after that result boot1.efi also works for me.
Ah, yes. That is an oddity of ld.bfd I had noticed before. If we can fix the freebsd code instead of implementing the same oddity in ld.lld that would be awesome.
Yep, I also noticed that earlier. I just do not think it is an oddity only of bfd, I think gold also do that since loader linked from source still has many left relocations, what means not all of them were relaxed and looks it "worked" because of the same reasons. (though I did not re-check) That is why I at first I supposed we might want to do the same, but did not find anything in ABI or somewhere else about storing addends in relocation targets.
I am closing this, since this one issue does not requires fixing on LLD side and we can open separate PR for others if any.
/libexec/ld-elf.so.1 is broken by a change between r287996 and r288228. I'm bisecting now.
The failure was introduced by r288107. I will follow up with details to the mailing list or a separate PR.
Ah, yes. That is an oddity of ld.bfd I had noticed before. If we can fix the freebsd code instead of implementing the same oddity in ld.lld that would be awesome.
Indeed. It's definitely a bug for FreeBSD to parse rela relocations as if they were rel. It would be hard to justify implementing the oddity in LLD, I think.
https://reviews.freebsd.org/D8681 fixes the FreeBSD self reloc code.
Something is still not quite right and linking the FreeBSD bootloaders with that change does not work for me, although George reported success. In my LLD-linked FreeBSD world+kernel test I'm seeing nearly all userland applications segfaulting, so I'm worried something has regressed in LLD in the last day or two. I'm rebuilding everything now to investigate further.
:-(
/libexec/ld-elf.so.1 is broken by a change between r287996 and r288228. I'm bisecting now.
So, bfd has the same two R_X86_64_RELATIVE but instead of using r_addend it writes the addend to the got entry? That is odd.
It's not instead of, it's both.
ld.bfd writes the same value in r_addend and at the relocation target in the got entry. Then if the runtime loader (self-reloc code in the bootloader case) incorrectly processes the relocation entries as if they were rel relocations, it "works." The FreeBSD EFI bootloader has a bug and relies on this GNU ld quirk right now; this is addressed by the FreeBSD code review linked above.
Ah, yes. That is an oddity of ld.bfd I had noticed before. If we can fix the freebsd code instead of implementing the same oddity in ld.lld that would be awesome.
Something is still not quite right and linking the FreeBSD bootloaders with that change does not work for me, although George reported success. In my LLD-linked FreeBSD world+kernel test I'm seeing nearly all userland applications segfaulting, so I'm worried something has regressed in LLD in the last day or two. I'm rebuilding everything now to investigate further.
:-(
So, bfd has the same two R_X86_64_RELATIVE but instead of using r_addend it writes the addend to the got entry? That is odd.
It's not instead of, it's both.
ld.bfd writes the same value in r_addend and at the relocation target in the got entry. Then if the runtime loader (self-reloc code in the bootloader case) incorrectly processes the relocation entries as if they were rel relocations, it "works." The FreeBSD EFI bootloader has a bug and relies on this GNU ld quirk right now; this is addressed by the FreeBSD code review linked above.
Something is still not quite right and linking the FreeBSD bootloaders with that change does not work for me, although George reported success. In my LLD-linked FreeBSD world+kernel test I'm seeing nearly all userland applications segfaulting, so I'm worried something has regressed in LLD in the last day or two. I'm rebuilding everything now to investigate further.
Looks nothing should be fixed on lld side. Though I wonder why ld has such behavior. I think we can close it then ?
It will be good to confirm that it indeed works with FreeBSD D8681 applied, when linked with LLD.
I don't know why GNU ld would have this odd behaviour, but I can see no reason we'd want to have LLD do this.
So, bfd has the same two R_X86_64_RELATIVE but instead of using r_addend it writes the addend to the got entry? That is odd.
Looks something with .got. It placed to .sdata by script. Out .got has 2 zero values, bfd has 2 non-zero values. gold does not generate .got because converts mov->lea.
If I edit slightly our code to match the gold, issue dissapears: if (Type ! = R_X86_64_GOTPCREL && // New condition. Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX) return RelExpr;
Not relaxing R_X86_64_GOTPCREL is probably intentional, but this is probably not the root cause as you point out in the following messages.
Looks nothing should be fixed on lld side. Though I wonder why ld has such behavior. I think we can close it then ?
It will be good to confirm that it indeed works with FreeBSD D8681 applied, when linked with LLD.
I don't know why GNU ld would have this odd behaviour, but I can see no reason we'd want to have LLD do this.
So FreeBSD guys going to fix that on their side: https://reviews.freebsd.org/D8681
Looks nothing should be fixed on lld side. Though I wonder why ld has such behavior. I think we can close it then ?
Ok, I think now the reason is pretty clear for me.
Loader has next relocations: Relocation section '.rela.dyn' at offset 0x7000 contains 2 entries: Offset Info Type Sym. Value Sym. Name + Addend 000000004000 000000000008 R_X86_64_RELATIVE 0000000000003010 000000004008 000000000008 R_X86_64_RELATIVE 0000000000003008
0x4000 is the .sdata, where .got lives: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 5] .sdata PROGBITS 0000000000004000 00005000 0000000000000010 0000000000000000 WA 0 0 8
Loader contains code for self relocations: Elf_Word relsz, relent; Elf_Addr newaddr; ElfW_Rel rel; ElfW_Dyn *dynp;
/*
* Find the relocation address, its size and the relocation entry.
*/
relsz = 0;
relent = 0;
for (dynp = dynamic; dynp->d_tag != DT_NULL; dynp++) {
switch (dynp->d_tag) {
case DT_REL:
case DT_RELA:
rel = (ElfW_Rel *)(dynp->d_un.d_ptr + baseaddr);
break;
case DT_RELSZ:
case DT_RELASZ:
relsz = dynp->d_un.d_val;
break;
case DT_RELENT:
case DT_RELAENT:
relent = dynp->d_un.d_val;
break;
default:
break;
}
}
/*
* Perform the actual relocation.
*/
for (; relsz > 0; relsz -= relent) {
switch (ELFW_R_TYPE(rel->r_info)) {
case RELOC_TYPE_NONE:
/* No relocation needs be performed. */
break;
case RELOC_TYPE_RELATIVE:
newaddr = (Elf_Addr *)(rel->r_offset + baseaddr);
/*
* For R_AARCH64_RELATIVE we need to calculate the
* delta between the address we are run from and the
* address we are linked at. As the latter is 0 we
* just use the address we are run from for this.
*/
*newaddr = baseaddr + rel->r_addend;
/* Address relative to the base address. */
*newaddr += baseaddr;
break;
default:
/* XXX: do we need other relocations ? */
break;
}
rel = (ElfW_Rel *)(void *)((caddr_t) rel + relent);
}
Notice that if ELF_RELA is not defined, it uses *newaddr += baseaddr; And ELF_RELA is not defined for amd64:
So it uses the value written in the got instead of relocation addend. We have zero values in got, what is different from bfd. That why issue happens it seems.
Looks something with .got. It placed to .sdata by script. Out .got has 2 zero values, bfd has 2 non-zero values. gold does not generate .got because converts mov->lea.
If I edit slightly our code to match the gold, issue dissapears: if (Type ! = R_X86_64_GOTPCREL && // New condition. Type != R_X86_64_GOTPCRELX && Type != R_X86_64_REX_GOTPCRELX) return RelExpr;
Can you attach the cpio of both cases?
Submitted.
I was incorrect here: conout = Xsystab->ConOut; //1 conout = systab->ConOut; //2
If line 1 is used, then it works, otherwise - not.
Can you attach the cpio of both cases?
Extended Description
sys/boot/efi/boot1 loader hangs in QEMU if linked with lld.
I was able to reduce its code to minimal case:
include <sys/param.h>
include <machine/elf.h>
include <machine/stdarg.h>
include
include
include
include "boot_module.h"
EFI_STATUS efi_main(EFI_HANDLE Ximage, EFI_SYSTEM_TABLE* Xsystab);
EFI_SYSTEM_TABLE systab; EFI_BOOT_SERVICES bs; static EFI_HANDLE *image;
EFI_STATUS efi_main(EFI_HANDLE Ximage, EFI_SYSTEM_TABLE Xsystab) { SIMPLE_TEXT_OUTPUT_INTERFACE conout = NULL; systab = Xsystab; image = Ximage; bs = Xsystab->BootServices;
//conout = Xsystab->ConOut; //1 conout = systab->ConOut; //2
conout->Reset(conout, TRUE); return EFI_SUCCESS; }
Now note that if line 1 is used, lld linked loader hangs. If line 2 is used, it does not. ld.bfd linked loader feels fine with any of these lines.
I am investigating the difference in output now.