Open WebFreak001 opened 4 years ago
An 8-bit controller target? Interesting. :) What triple do you use to generate the .ll files? I guess we'd only need to enable the AVR target for our LLVM to support direct .o emission and linking via `-gcc=avr-gcc`.
Have you checked whether the functions are emitted into separate sections in the object file? E.g., `llvm-readelf --sections myobject.o`. That's a prerequisite for ld's stripping via `--gc-sections`. I don't know how you generate the .o files from the .ll files, but as it's probably an LLVM tool, there might be a `-function-sections` (and `-data-sections`) command-line option for that.
I've tried two ways to build it so far; one of them already uses -function-sections and -data-sections:
```
// using .o files from ldc
// build with -output-o into obj/ folder first
avr-gcc -mmcu=atmega1284p -Wall -Wl,"--gc-sections" obj/*.o -o project.elf
avr-objcopy -O ihex -R .eeprom project.elf project.hex
```
This way I get the following elf sections for the module which contains the templated struct, causing the bloat:
```
// using .ll files from ldc
// build with -output-ll into obj/ folder first
llvm-link -S -o project.ll obj/*.ll
opt -S -Oz --data-sections --function-sections --inline --strip-dead-prototypes --strip-dead-debug-info --strip project.ll --march=avr --mcpu=atmega1284p -o project.opt.ll
llc project.opt.ll --data-sections --function-sections --march=avr --mcpu=atmega1284p -filetype=obj -O2 -o project.o
avr-gcc -fdata-sections -mmcu=atmega1284p -Wall -Wl,"--gc-sections" project.o -o project.elf
avr-objcopy -O ihex -R .eeprom project.elf project.hex
```
This way I get the following ELF sections (of the combined project.o file here):
The full code is also at https://github.com/WebFreak001/avrd (requires a patched dub with the PR being open right now)
When reading the disassembly you can first see the good executable and then all the dead code though:
So I don't know what's causing them to be stuck in the binary or why nothing seems to have an effect.
As expected, you seem not to end up with separate sections. This:
```d
void foo() {}
void bar() {}
```
yields something like this with `ldc2 -c -mtriple=x86_64-pc-linux-gnu current.d && llvm-readelf --sections current.o`:
```
  [Nr] Name                            Type                Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                                 NULL                0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab                         STRTAB              0000000000000000 0003d8 000155 00      0   0  1
  [ 2] .text                           PROGBITS            0000000000000000 000040 000000 00  AX  0   0  4
  [ 3] .group                          GROUP               0000000000000000 000168 000008 04     24   8  4
  [ 4] .text._D7current3fooFZv         PROGBITS            0000000000000000 000040 000006 00 AXG  0   0 16
  [ 5] .group                          GROUP               0000000000000000 000170 000008 04     24   7  4
  [ 6] .text._D7current3barFZv         PROGBITS            0000000000000000 000050 000006 00 AXG  0   0 16
  [ 7] .text.ldc.register_dso          PROGBITS            0000000000000000 000060 000043 00  AX  0   0 16
  [ 8] .rela.text.ldc.register_dso     RELA                0000000000000000 0002e8 000060 18     24   7  8
  [ 9] .group                          GROUP               0000000000000000 000178 000008 04     24   6  4
  [10] .data._D7current12__ModuleInfoZ PROGBITS            0000000000000000 0000a8 000010 00 WAG  0   0  8
  [11] __minfo                         PROGBITS            0000000000000000 0000b8 000008 00  WA  0   0  8
  [12] .rela__minfo                    RELA                0000000000000000 000348 000018 18     24  11  8
  [13] .bss.ldc.dso_slot               NOBITS              0000000000000000 0000c0 000008 00  WA  0   0  8
  [14] .group                          GROUP               0000000000000000 000180 000014 04     24  13  4
  [15] .init_array                     INIT_ARRAY          0000000000000000 0000c0 000008 00 WAG  0   0  8
  [16] .rela.init_array                RELA                0000000000000000 000360 000018 18   G 24  15  8
  [17] .fini_array                     FINI_ARRAY          0000000000000000 0000c8 000008 00 WAG  0   0  8
  [18] .rela.fini_array                RELA                0000000000000000 000378 000018 18   G 24  17  8
  [19] .linker-options                 LLVM_LINKER_OPTIONS 0000000000000000 0000d0 000000 00   E  0   0  1
  [20] .comment                        PROGBITS            0000000000000000 0000d0 000026 01  MS  0   0  1
  [21] .note.GNU-stack                 PROGBITS            0000000000000000 0000f6 000000 00      0   0  1
  [22] .eh_frame                       X86_64_UNWIND       0000000000000000 0000f8 000070 00   A  0   0  8
  [23] .rela.eh_frame                  RELA                0000000000000000 000390 000048 18     24  22  8
  [24] .symtab                         SYMTAB              0000000000000000 000198 000150 18      1   5  8
```
Notice the various `.text.*` sections (which the linker will merge into a single final `.text` section in the ELF binary). Your object files feature a 0-sized `.text` section, and a `.progmem.data` section at the same offset which seems to be the real deal, but it's all one 'big' section, so the linker cannot strip anything.
In case this is a bug or limitation of the AVR LLVM backend, you can also play around with custom sections:
```d
import ldc.attributes;

@section(".progmem.data." ~ foo.mangleof)
void foo() {}
// => emitted into object file section `.progmem.data._D7current3fooFZv`
```
Thank you! The `@section` trick did wonders and completely eliminated all the junk.
This does seem a little bit like a hack, though. Isn't there a way to do this at the LLVM level already, instead of only removing the code at the linker? Otherwise I would close this now, as it works exactly like I need it to.
Inlining without emitting the inlined functions at all in IR would most likely entail something as ugly as DMD's approach, inlining at the AST level, and that's not likely going to happen.
Functions that are always inlined can be marked with `available_externally` linkage in LLVM IR, such that they are not emitted in the object file. It is tricky: for example, if you take the address of such a function, then you'll get a linker error. Currently I don't think we have any means for the user to explicitly apply the `available_externally` linkage type to functions.
Bump: I wanted to make a small library to test whether `x is typeof(x).init` (because I have both long type names and long variable names that I wanted to cut down on), but it bloats the executable with lots of isDefault functions.
Try using a function literal like this:
```d
pragma(inline, true)
alias isDefault = (auto ref x) => x is typeof(x).init;
```
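For illustration, here is a minimal sketch of how such an alias might be used. The `Pin` struct and the values are hypothetical and only serve the example; whether the literal's instantiations actually stay out of the object file depends on the compiler and optimization settings, so check the section listing of your own build.

```d
// The alias suggested above: a template function literal, instantiated
// separately for each argument type it is called with.
pragma(inline, true)
alias isDefault = (auto ref x) => x is typeof(x).init;

// Hypothetical example type, not part of the original project.
struct Pin
{
    int port;
    int bit;
}

unittest
{
    Pin p;                   // default-initialized, so equal to Pin.init
    assert(isDefault(p));

    p.bit = 3;               // any changed field makes it differ from Pin.init
    assert(!isDefault(p));

    assert(isDefault(0));    // int.init is 0; rvalues work via `auto ref`
    assert(!isDefault(42));
}
```

One way to run the checks on a host machine is `ldc2 -unittest -main -run file.d`, where `file.d` is whatever you saved the snippet as.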
awesome, that works, even as operator overload!
I'm currently using LDC to generate LLVM IR files, which I then compile to AVR-compatible .o files and link using avr-gcc (which calls avr-ld).
My problem is that I heavily use templated structs without fields, with all methods static and force-inlined, as syntactic sugar. This generates around 3000 lines of useless LLVM IR in my project right now, which blows up the resulting ELF file by over 2KB (increasing flash time by several seconds and leaving me with a lot less flash memory).
For reference my code:
My templated struct for transparent volatile pointers
```d
// all this code does is translating this C define to D:
// #define MEM(addr) (*((volatile T*)addr))
// so you can do
// #define MEMX MEM(0x20)
// and then use it using
// MEMX = 4;
// MEMX |= 8;

/// Helper struct to automatically call volatileStore and volatileLoad on assignment/reading of pointers
private template VolatileRef(T, alias addr)
{
    private struct VolatileRef
    {
        alias get this;

    pragma(inline, true):
        // TODO: the following functions still make it into the resulting HEX file, even though they are unused if always inlined
        // they should be removed from the hex file somehow, but they don't get stripped or removed by -Wl,--gc-sections
        static T* ptr()
        {
            return cast(T*)(addr);
        }

        static T get()
        {
            return volatileLoad(ptr);
        }

        static void opAssign(T value)
        {
            volatileStore(ptr, value);
        }

        static auto opOpAssign(string op)(T value)
        {
            T ret;
            mixin("volatileStore(ptr, ret = cast(T)(volatileLoad(ptr) " ~ op ~ " value));");
            return ret;
        }
    }
}

// enum MEM(alias addr) = VolatileRef!(ubyte, cast(ubyte*)addr);
// there are like 200 of these defines in my code:
// enum MEMX = MEM!0x20;
// MEMX = 6;
```
It would be great if it was possible to make LDC somehow not even emit the functions as LLVM IR, but instead always inline the IR code directly, so it works across modules before optimization as well (because of #3126).
Is there maybe an existing way (using LLVM) to strip out the unused methods? I am using `--fvisibility=hidden`, so they are actually marked as hidden in the .ll files, but they are still being built and still exist in the final ELF file. I also tried `--internalize`, but that made the main function inaccessible or renamed it to something the linker couldn't find (the .ll file was still huge, though).
Otherwise it would be great to have a ldc.attributes.forceInline or something similar. Or maybe this would be fixed along with #2968?
Otherwise LDC is working great with embedded development using AVR! :)