Open dcb314 opened 2 years ago
The SECTIONS linker script command is intentionally left out. Besides INPUT, GROUP and the like, I don't think we want to support any linker script command.
"unknown linker script token" looks like a bug. If it's intentional, maybe change it to something like "SECTIONS command is unimplemented, please use lld for full linker script support".
Good point! Will do.
Not implementing the SECTIONS command breaks (m)any use cases that involve building firmware images for embedded systems. E.g., linking firmware for the STM32 MCU family relies on a custom linker script that's basically (with some slight modifications based on the controller):
ENTRY(Reset_Handler)
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 96K
RAM2 (xrw) : ORIGIN = 0x10000000, LENGTH = 32K
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 1024K
}
SECTIONS
{
.isr_vector : {
. = ALIGN(8);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(8);
} >FLASH
.text : {
. = ALIGN(8);
*(.text)
*(.text*)
*(.glue_7)
*(.glue_7t)
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(8);
_etext = .;
} >FLASH
.rodata : {
. = ALIGN(8);
*(.rodata)
*(.rodata*)
. = ALIGN(8);
} >FLASH
/* other sections */
.init_array : {
. = ALIGN(8);
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array*))
PROVIDE_HIDDEN (__init_array_end = .);
. = ALIGN(8);
} >FLASH
.fini_array : {
. = ALIGN(8);
PROVIDE_HIDDEN (__fini_array_start = .);
KEEP (*(SORT(.fini_array.*)))
KEEP (*(.fini_array*))
PROVIDE_HIDDEN (__fini_array_end = .);
. = ALIGN(8);
} >FLASH
_sidata = LOADADDR(.data);
.data : {
. = ALIGN(8);
_sdata = .;
*(.data)
*(.data*)
. = ALIGN(8);
_edata = .;
} >RAM AT> FLASH
. = ALIGN(4);
.bss :
{
_sbss = .;
__bss_start__ = _sbss;
*(.bss)
*(.bss*)
*(COMMON)
. = ALIGN(4);
_ebss = .;
__bss_end__ = _ebss;
} >RAM
. = ALIGN(4);
.ram2 : {
_sram2 = .;
__ram2_start__ = _sram2;
*(.ram2)
*(.ram2*)
. = ALIGN(4);
_eram2 = .;
__ram2_end__ = _eram2;
} >RAM2
._user_heap_stack : {
. = ALIGN(8);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM
/* Remove information from the standard libraries */
/DISCARD/ : {
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
}
> Not implementing the SECTIONS command breaks (m)any use cases that involve building firmware images for embedded systems.
Quite possibly. I checked all of the more than 7,000 packages in Fedora Linux and I could find only one package that required the feature.
For normal package builds on some host OS you're likely very fine without SECTIONS support. Only when you need to produce image files that can be flashed onto some embedded MCU like Atmel/STM1 series will you encounter toolchains that use this kind of processing.
So while I don't see the feature on the top priority list of things, I'd at least suggest keeping it in mind for further mid-term development.
SerenityOS's kernel linker script also uses sections, however not in a complex way. As far as I can tell, this way of using SECTIONS is well parallelizable and could benefit from mold as well.
FWIW, SerenityOS's userspace doesn't benefit from mold by any measurable degree. There's a lot of dynamically linked binaries and libraries (all libs&bins in userspace are dynamically linked) and the existing parallelization is enough to negate the mold linking benefit currently. The Kernel as the only statically linked large binary could benefit a lot from mold.
This also prevents building the Linux kernel, see module.lds.S
Not that you really need parallel linking for the kernel, it takes a few seconds on modern hardware with lld. I just thought it was an interesting use-case that the lack of SECTIONS support affects.
The SECTIONS command is the core of the linker script functionality and the core of its complexity. I want to implement a feature that can replace the usage of the linker script. One idea is explained here: https://www.reddit.com/r/programming/comments/p1ad4v/mold_a_modern_linker/h8e2jvs/?utm_source=reddit&utm_medium=web2x&context=3
FYI, this bug is preventing our Mozilla fork (which the Pale Moon browser uses) from using the mold linker. It immediately fails when it goes on to link libxul.so:
1672653579.086131 mold: fatal: ../../../../platform/toolkit/library/StaticXULComponents.ld:1: SECTIONS {}
1672653579.086401 ^ unknown linker script token
1672653579.086662 collect2: error: ld returned 1 exit status
1672653579.086924 gmake[5]: *** [/home/job/Software/uxp-work/platform/config/rules.mk:773: libxul.so] Error 1
1672653579.087187 gmake[4]: *** [/home/job/Software/uxp-work/platform/config/recurse.mk:71: toolkit/library/target] Error 2
1672653579.087451 gmake[3]: *** [/home/job/Software/uxp-work/platform/config/recurse.mk:33: compile] Error 2
1672653579.087725 gmake[2]: *** [/home/job/Software/uxp-work/platform/config/rules.mk:494: default] Error 2
1672653579.087987 gmake[1]: *** [/home/job/Software/uxp-work/palemoon/client.mk:406: realbuild] Error 2
1672653579.088244 gmake: *** [client.mk:164: build] Error 2
@jobbautista9 Do you know why your program contains the empty SECTIONS directive?
It isn't actually empty; according to toolkit/library/libxul.mk, this SECTIONS directive gets filled when the linker is bfd:
# BFD ld doesn't create multiple PT_LOADs as usual when an unknown section
# exists. Using an implicit linker script to make it fold that section in
# .data.rel.ro makes it create multiple PT_LOADs. That implicit linker
# script however makes gold misbehave, first because it doesn't like that
# the linker script is given after crtbegin.o, and even past that, replaces
# the default section rules with those from the script instead of
# supplementing them. Which leads to a lib with a huge load of sections.
ifneq (OpenBSD,$(OS_TARGET))
ifneq (WINNT,$(OS_TARGET))
ifdef LD_IS_BFD
OS_LDFLAGS += $(topsrcdir)/toolkit/library/StaticXULComponents.ld
endif
endif
endif
And the SECTIONS directive when BFD is detected:
SECTIONS {
.data.rel.ro : {
*(.kPStaticModules)
}
}
This seems to be something we've inherited from Mozilla when we initially forked from mozilla-central 52.6.0: https://bugzilla.mozilla.org/show_bug.cgi?id=938437
Mozilla has moved on from this SECTIONS solution in bug 1541792.
I'm not sure if we can backport this to the platform, and considering that Mozilla seems to have done this to reduce memory overhead for creating a new process (which is a problem we don't have because we intentionally don't use multi-process or e10s as they call it), I think it's unlikely we will do something about this SECTIONS directive.
Now I don't know if it's possible to have it not put a SECTIONS directive at all instead of putting an empty one if it doesn't detect bfd. I'm admittedly not that familiar with linker stuff...
That particular piece of linker script is there to make the .kPStaticModules section read-only after process initialization. Can you find the place in your code where the .kPStaticModules section is added? If so, by renaming it to .kPStaticModules.rel.ro, you can do the same without using the linker script.
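For anyone following along, here is a minimal sketch of what the rename looks like on the source side, assuming GCC or Clang targeting ELF. The struct, the variable names and the helper functions are made up for illustration; only the section name comes from the comment above, and whether a given linker treats that name as RELRO is likewise taken from that comment rather than verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module record; the real kPStaticModules entries look different. */
struct module {
    const char *name;
    int version;
};

static const struct module example_module = { "example", 1 };

/*
 * Ask the compiler to emit this table into a section with the suggested
 * name instead of the default one. Assumption (from the comment above,
 * not verified here): the linker treats a section named like this as
 * RELRO, i.e. read-only once relocations have been applied.
 */
__attribute__((section(".kPStaticModules.rel.ro"), used))
const struct module *const static_modules[] = { &example_module };

/* Number of entries in the table. */
size_t static_module_count(void)
{
    return sizeof(static_modules) / sizeof(static_modules[0]);
}

/* Name of the i-th module, or NULL when i is out of range. */
const char *static_module_name(size_t i)
{
    if (i >= static_module_count())
        return NULL;
    assert(static_modules[i] != NULL);
    return static_modules[i]->name;
}
```

Because the section name is chosen in the source, no SECTIONS block is needed just to steer the table into a particular output section.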
I've renamed all instances of kPStaticModules I could find with xref, removed the linker script, and added a section insert of .kPStaticModules.rel.ro in [config/expandlibs_exec.py](https://xref.palemoon.org/goanna-central/source/platform/config/expandlibs_exec.py#47), and I can confirm a successful build with clang 14/bfd and gcc 12/mold! I can't believe it's that easy; what the hell was Mozilla smoking at the time? Anyway, thanks so much for helping me! :D
I'm gonna do some more building and stability testing. Once I'm convinced the resulting binaries are as stable as gcc 9/bfd, I might pitch this to the lead developer. Hell, maybe mold could make us revisit de-unifying our platform, which we had to stop because it was overwhelming the poor bfd linker. (Actually I already have another solution for that problem, which is to just switch the compiler to clang 14/15, and bfd actually will perform really well, linking libxul.so in about 30 seconds with memory usage not even reaching 4 GB. But I wanted a better solution which is compiler-independent, so I turned to mold. With the relro patches, gcc 12/mold linked libxul in 13 seconds, and I didn't even get to see how much memory it used... It's fast! :D)
Great! But 13 seconds is not considered fast by our standards :) Last time I tried, it could be linked in a few seconds. How many cores does your machine have?
@jobbautista9 Could you please upload the git diff somewhere? I want to get mold+pm working on my PC, as it is the last package that was not linking there. Sorry for hijacking the thread.
@rui314 My 6th gen Intel Core i3 only has 2 cores with 4 threads. If it was built in a 20-core server it definitely would have been like a second. :)
@Kokokokoka I just finished doing some building and runtime stability tests, and I realized that the changes I did also fixed linking libxul with gold and lld, which previously failed for me. I'm currently writing up an issue in the repo's bug tracker, and am drafting a PR. I will put up a link to the PR in an edit shortly. :)
EDIT: PR is up now! https://repo.palemoon.org/MoonchildProductions/UXP/pulls/2080
Can confirm that this patch fixes the pm build on Gentoo with mold. It's a pity that the LTO build still fails with "NSModules are not ordered appropriately", though.
GHC also needs SECTIONS and has a driver/utils/merge_sections.ld script:
/* Linker script to undo -split-sections and merge all sections together when
* linking relocatable object files for GHCi.
* ld -r normally retains the individual sections, which is what you would want
* if the intention is to eventually link into a binary with --gc-sections, but
* it doesn't have a flag for directly doing what we want. */
SECTIONS
{
.text : {
*(.text*)
}
.rodata.cst16 : {
*(.rodata.cst16*)
}
.rodata : {
*(.rodata*)
}
.data.rel.ro : {
*(.data.rel.ro*)
}
.data : {
*(.data*)
}
.bss : {
*(.bss*)
}
}
Instead of that linker script, please pass the --relocatable-merge-sections option, which tells the linker to do the same without a linker script.
I'm not sure anymore about the syntax of this script, but we use SECTIONS to reserve space in the address-space of our programs, where we then map a large shared-memory segment to:
SECTIONS {
.shmdata 0x2000000 : {
*(.shmdata)
}
} INSERT AFTER .bss;
@RealLitb I think you can do the same thing without the linker script just by passing -Wl,--section-start=.shmdata=0x2000000.
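A rough sketch of the source side of that suggestion, assuming GCC or Clang on an ELF target. The buffer size, the variable name and the helper functions are hypothetical; only the .shmdata section name and the address come from the comments above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size of the reserved window; pick whatever the mapping needs. */
#define SHM_RESERVED_BYTES 4096

/*
 * Reserve address space by placing a buffer in its own section.
 * The section name must match the one passed to --section-start.
 */
__attribute__((section(".shmdata"), used))
static unsigned char shm_reserved[SHM_RESERVED_BYTES];

/* Start of the reserved window. */
void *shm_base(void)
{
    return shm_reserved;
}

/* Size of the reserved window in bytes. */
size_t shm_size(void)
{
    return sizeof(shm_reserved);
}

/* Pointer to a byte inside the window; offset must be in range. */
unsigned char *shm_at(size_t offset)
{
    assert(offset < shm_size());
    return shm_reserved + offset;
}
```

Linked with -Wl,--section-start=.shmdata=0x2000000, shm_base() is expected to return that address. This assumes a non-PIE executable (for example built with -no-pie), since a position-independent executable is relocated at load time.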
I am also experiencing this issue when I set mold as my default linker in Fedora Linux and try to build the nvidia module using sudo akmod --force:
4.14-200.fc38.x86_64/nvidia.mod.o; true
2023/09/10 11:39:13 akmodsbuild: # LD [M] /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-peermem.ko
2023/09/10 11:39:13 akmodsbuild: ld -r -m elf_x86_64 -z noexecstack --build-id=sha1 -T scripts/module.lds -o /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-peermem.ko /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-peermem.o /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-peermem.mod.o; true
2023/09/10 11:39:13 akmodsbuild: mold: fatal: scripts/module.lds:1: SECTIONS {
2023/09/10 11:39:13 akmodsbuild: ^ unknown linker script token
2023/09/10 11:39:13 akmodsbuild: make[2]: *** [scripts/Makefile.modfinal:61: /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-modeset.ko] Error 1
2023/09/10 11:39:13 akmodsbuild: make[2]: *** Waiting for unfinished jobs....
2023/09/10 11:39:13 akmodsbuild: mold: fatal: scripts/module.lds:1: SECTIONS {
2023/09/10 11:39:13 akmodsbuild: ^ unknown linker script token
2023/09/10 11:39:13 akmodsbuild: make[2]: *** [scripts/Makefile.modfinal:61: /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia.ko] Error 1
2023/09/10 11:39:13 akmodsbuild: mold: fatal: scripts/module.lds:1: SECTIONS {
2023/09/10 11:39:13 akmodsbuild: ^ unknown linker script token
2023/09/10 11:39:13 akmodsbuild: make[2]: *** [scripts/Makefile.modfinal:61: /tmp/akmodsbuild.GCNUhS4t/BUILD/nvidia-kmod-535.104.05/_kmod_build_6.4.14-200.fc38.x86_64/nvidia-peermem.ko] Error 1
2023/09/10 11:39:13 akmodsbuild: make[1]: *** [Makefile:1970: modules] Error 2
2023/09/10 11:39:13 akmodsbuild: make[1]: Leaving directory '/usr/src/kernels/6.4.14-200.fc38.x86_64'
2023/09/10 11:39:13 akmodsbuild: make: *** [Makefile:82: modules] Error 2
This similarly makes it impossible to use mold to build the ZFS kernel modules. Perhaps it is possible to replicate the functionality of the linker script using objcopy or other similar commands, but I am not an expert. I understand all too well, however, not wanting to support something that you feel is outdated and overly complex. With that in mind, what does the path forwards look like for your proposed replacement for linker scripts? Is it in development, or are there ways that users could help support your efforts (aside from code contributions and donations, of course)?
Also unable to build libnvidia-container https://github.com/NVIDIA/libnvidia-container/blob/main/src/libnvidia-container.lds
@wallentx Nvidia's linker script seems to be unnecessary. The SECTIONS command doesn't seem to do anything meaningful. ENTRY can be replaced with the -e command line parameter, and the VERSION clause can be moved to a version script.
That indeed did the trick! Thanks!
@wallentx Would you mind reporting it back to Nvidia's original repo?
@wallentx can you please share what you did to make it build?
@smac89 In addition to the changes I made to the Makefile and the introduction of a version.ver file, I also had to apply a few patch files that were included within the PKGBUILD from the AUR to get this to build on Arch, so YMMV with this. This does result in working artifacts for me. https://github.com/NVIDIA/libnvidia-container/compare/v1.14.3...wallentx:libnvidia-container:wallentx/arch-build.patch
I feel obligated to warn you that this is all very much "here be dragons" territory for me, and I had no idea ldscripts existed prior to trying to solve this. I still don't really understand what work gets handed off to the linker, and why.. so.. use at your own risk, etc.
Since my last comment, I've noticed that a few different nvidia packages fail to build with mold, but I've only worked around this by temporarily disabling mold in my makepkg.conf
Package kvm-unit-tests seems to have a linker script as follows:
SECTIONS
{
. = 4M + SIZEOF_HEADERS;
stext = .;
.text : { *(.init) *(.text) *(.text.*) }
. = ALIGN(4K);
.data : {
*(.data)
exception_table_start = .;
*(.data.ex)
exception_table_end = .;
}
. = ALIGN(16);
.rodata : { *(.rodata) }
. = ALIGN(16);
.bss : { *(.bss) }
. = ALIGN(4K);
edata = .;
}
ENTRY(start)
mold says:
mold: /home/dcb35/rpmbuild/BUILD/kvm-unit-tests-1edfa966328dfc824e9b0351087bbfaf699dce04/x86/flat.lds:1: SECTIONS
^ unknown linker script token