`.bss` and `.data` garbage collection optimization

cr1901 commented 4 years ago

I just discovered this behavior tonight, and figured it's a worthwhile optimization: the GNU assembler and newlib work together to implement .bss and .data initialization garbage collection (e.g. if your .data section is empty, the code to initialize it will be GC'd at link time).

Does LLVM's assembler have any provisions for this? I see you already support the `.refsym` directive. According to DJ Delorie's page:

> The assembler has code to detect if either the data or bss sections are used, and if they are, it will use .refsym to tell crt0 to pull in snippets of code to initialize the RAM correctly. However, this means that most objects will now have extra "undefined" symbols that aren't part of your application: `U __crt0_movedata`

GNU `as` doesn't seem to use `.refsym` directly; I think the important part is to inject undefined symbols, without taking up binary space, when non-empty `.bss` and/or `.data` sections are detected at assembly time.
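To make the mechanism concrete, here is a toy model of it in Rust, runnable on a host. It is only a sketch of the idea, not how GNU `as` or any linker is implemented: `assemble` and `link` are hypothetical helpers, and `__crt0_init_bss` is an assumed name for the `.bss` counterpart of the `__crt0_movedata` symbol quoted above.

```rust
use std::collections::BTreeSet;

/// A toy object file: only its undefined symbols matter for this model.
struct Object {
    undefined: BTreeSet<&'static str>,
}

/// Model of the assembler step: when a section is non-empty, record an
/// undefined reference to the matching crt0 snippet. Nothing else is emitted,
/// so the reference costs no space in the final binary.
fn assemble(has_data: bool, has_bss: bool) -> Object {
    let mut undefined = BTreeSet::new();
    if has_data {
        undefined.insert("__crt0_movedata");
    }
    if has_bss {
        // Assumed symbol name, used here for illustration only.
        undefined.insert("__crt0_init_bss");
    }
    Object { undefined }
}

/// Model of the link step: a crt0 snippet is kept only if at least one
/// object references it; unreferenced snippets are never pulled in.
fn link(objects: &[Object]) -> BTreeSet<&'static str> {
    objects
        .iter()
        .flat_map(|o| o.undefined.iter().copied())
        .collect()
}
```

In this model, linking objects that only use `.bss` keeps the `.bss` snippet and drops the `.data` one, which is the behavior described above.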

asl commented 4 years ago

All of these approaches look like a somewhat hackish way to solve the same problem. And I believe LTO would be the proper solution here – we may have a possibility to deliver all runtime libraries as LLVM IR and do whole program LTO :)

cr1901 commented 4 years ago

Sure, I'm not particularly attached to any one solution :).

> And I believe LTO would be the proper solution here

I thought LTO couldn't make this optimization because it doesn't get the `_edata`/`_ebss` etc. symbols until it's too late. My understanding is that LTO means "compile down to an IR and, when the linker is invoked, do the rest of the compilation process starting with the IR of all files", as if you had one big source file. And `_edata`/`_ebss` are only provided after the IR is compiled to an object.

> we may have a possibility to deliver all runtime libraries as LLVM IR and do whole program LTO :)

Is "whole program LTO" distinct from "LTO" without that "whole program" qualifier :P?

asl commented 4 years ago

In fact, yes. Typically there is a whole bunch of libraries (e.g. the C standard library) that come from the system and therefore cannot take part in LTO. Here we have a rare possibility to control everything, for example, ship newlib both as object and LLVM IR and essentially include standard libraries into LTO process.

cr1901 commented 4 years ago

> Here we have a rare possibility to control everything, for example, ship newlib both as object and LLVM IR and essentially include standard libraries into LTO process.

@asl That would be a very interesting approach to the problem I'm trying to solve (and we could get away w/ it since msp430 is so smol :P). As per usual, my use case is Rust: I have firmware that can fit into small (512 byte to 2kB) devices when maximum optimizations are enabled.

Optimizing away .data and/or .bss from the equivalent of crt0 when appropriate is one good optimization. I would have to think about how to convey this information from the Rust side to your LLVM so that the startup code from r0 becomes a candidate for your proposed LTO optimizations.¹
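For reference, the two startup steps in question boil down to something like the sketch below. It is modeled on plain host slices so it can run anywhere; real startup code would get its bounds from linker-provided symbols such as `_edata`/`_ebss`. The function names are illustrative and not claimed to match any particular crate's API.

```rust
/// Zero a RAM region, as startup code does for `.bss`.
/// With an empty region this does nothing, which is why the whole routine
/// is dead weight when the program has no `.bss` content.
fn zero_bss(bss: &mut [u16]) {
    for word in bss.iter_mut() {
        *word = 0;
    }
}

/// Copy initial values from the load image (e.g. flash) into RAM, as startup
/// code does for `.data`. Panics if the two lengths differ; on a real target
/// both lengths would be derived from the same linker symbols.
fn init_data(data: &mut [u16], load_image: &[u16]) {
    data.copy_from_slice(load_image);
}
```

Both routines are no-ops on empty regions, but since the region bounds only become known at link time, the code itself stays in the binary unless something tells the linker it is unreferenced.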

Seeing the GNU C compiler but not Rust perform the `.data`/`.bss` optimization made me wonder what's different between Rust and C, only to figure out that the GNU assembler hardcodes special logic for this optimization. I'm definitely open to something far more elegant like your whole program LTO proposal.

It is of course possible that the optimizations should be happening already, but I'm not sure how I would check this. Mind if I post some LLVM IR of a small test program for reference?

access-softek / llvm-project

`.bss` and `.data` garbage collection optimization #16