ARM-software / abi-aa

Application Binary Interface for the Arm® Architecture

Optional mapping symbol at the beginning of a text section #274

Open MaskRay opened 1 month ago

MaskRay commented 1 month ago

aaelf64 specifies (emphasis is mine):

> Mapping symbols defined in a section (relocatable view) or segment (executable view) define a sequence of half-open intervals that cover the address range of the section or segment. Each interval starts at the address defined by the mapping symbol, and continues up to, but not including, the address defined by the next (in address order) mapping symbol or the end of the section or segment. A section that contains instructions must have a mapping symbol defined at the beginning of the section. If a section contains only data no mapping symbol is required. A platform ABI should specify whether or not mapping symbols are present in the executable view; they will never be present in a stripped executable file.
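
The half-open intervals described in the quoted text can be sketched in a few lines of Python. This is an illustration only: `mapping_intervals` and its `(offset, name)` tuples are hypothetical names, not part of any ABI tool or library.

```python
from typing import List, Tuple

def mapping_intervals(symbols: List[Tuple[int, str]],
                      section_size: int) -> List[Tuple[int, int, str]]:
    """Turn (offset, name) mapping symbols into [start, end) intervals."""
    # Sort by offset only; sorted() is stable, so input order breaks ties.
    ordered = sorted(symbols, key=lambda s: s[0])
    intervals = []
    for i, (start, name) in enumerate(ordered):
        # Each interval ends at the next mapping symbol or at the section end.
        end = ordered[i + 1][0] if i + 1 < len(ordered) else section_size
        if start < end:  # skip empty intervals
            intervals.append((start, end, name))
    return intervals

# One 4-byte instruction followed by one 4-byte data word.
print(mapping_intervals([(0, "$x"), (4, "$d")], 8))
# prints [(0, 4, '$x'), (4, 8, '$d')]
```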

GNU assembler and LLVM integrated assembler behaviors:

Except the trivial cases (e.g. empty section),

While non-compliant, removing the initial $x is highly attractive due to the significant .symtab size reduction. Optimizing both $d and $x eliminates 50.9% of .symtab entries. (https://github.com/MaskRay/llvm-project/commits/a64-mapping-text/)

In my modified LLVM, the assembler no longer automatically adds an initial $x to a text section. Instead, it assumes that text sections start with an implicit $x and adds an ending $x only if the final data is not instructions. During the linking process, when sections are combined into the final output, the subsequent section picks up the A64 state.

.section .text1,"ax"
nop
// emit $d
.long 42
// emit $x

.section .text2,"ax"
nop

When .text1 is assembled using the traditional behavior, the absence of an initial $x symbol at .text2 (new behavior) might confuse disassemblers. In addition, a linker script combining non-text sections and text sections might confuse disassemblers.

However, this mix-and-match isn't a major issue for a significant portion of users.

 .o size   |   build |
 261345384 |   a64-0 | standard
 260443728 |   a64-1 | optimizing $d
 254106784 |   a64-2 | optimizing both $d and $x
% ~/projects/bloaty/out/debug/bloaty a64-1/bin/clang -- a64-0/bin/clang
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.0%     +16  +0.0%     +16    .text
  [ = ]       0  -1.8%     -16    .relro_padding
  -0.7%  -154Ki  [ = ]       0    .strtab
  -6.8%  -561Ki  [ = ]       0    .symtab
  -0.5%  -715Ki  [ = ]       0    TOTAL
% ~/projects/bloaty/out/debug/bloaty a64-2/bin/clang -- a64-0/bin/clang
    FILE SIZE        VM SIZE
 --------------  --------------
  -5.4% -1.13Mi  [ = ]       0    .strtab
 -50.9% -4.09Mi  [ = ]       0    .symtab
  -4.0% -5.22Mi  [ = ]       0    TOTAL
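
As a rough cross-check of these numbers, the relative savings can be recomputed from the .o sizes in the table above. The sketch below is plain arithmetic; the entry-count estimate assumes that the whole 4.09 MiB .symtab delta consists of removed 24-byte `Elf64_Sym` entries, and bloaty's figures are rounded, so treat the result as approximate.

```python
# Sizes in bytes, copied from the table above.
OBJ_SIZES = {"a64-0": 261345384, "a64-1": 260443728, "a64-2": 254106784}
ELF64_SYM_SIZE = 24  # sizeof(Elf64_Sym) in the ELF64 format

def saving(build: str, baseline: str = "a64-0"):
    """Return (bytes saved, percent saved) for a build relative to the baseline."""
    saved = OBJ_SIZES[baseline] - OBJ_SIZES[build]
    return saved, 100.0 * saved / OBJ_SIZES[baseline]

for build in ("a64-1", "a64-2"):
    saved, pct = saving(build)
    print(f"{build}: {saved} bytes saved ({pct:.2f}%)")

# Assumption: the whole 4.09 MiB .symtab delta consists of removed entries.
removed_entries = round(4.09 * 1024 * 1024 / ELF64_SYM_SIZE)
print(f"about {removed_entries} .symtab entries removed from clang")
```

Under these assumptions the total .o size shrinks by roughly 0.3% when only $d is optimized and by roughly 2.8% when both are, and the linked clang loses on the order of 179,000 symbol table entries.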

The GNU assembler has additional state transitions for implicit and explicit alignments. (See the related code near [PATCH][GAS][AARCH64]Fix "align directive causes MAP_DATA symbol to be lost" https://sourceware.org/bugzilla/show_bug.cgi?id=20364)

.section .foo1
.word 0         // no $d

.section .foo2
.balign 4       // $d
.word 0

Compiler-generated .debug_* sections don't have alignment directives, and therefore they don't have $d.

smithp35 commented 1 month ago

Will need to have a check with some colleagues on the impact of making the change. I'm more worried about the impact on binary scanners (something like BOLT, or the ERRATA fixes in linkers) as they need to avoid scanning code as data, and vice-versa.

A possible rewriting would be to add an implicit $d at offset 0 if a section has no mapping symbols and does not have SHF_EXECINSTR flag. Add an implicit $x at offset 0 if a section has no mapping symbols but does have the SHF_EXECINSTR flag.

The case where this would change behaviour from today is if there is a section with SHF_EXECINSTR but contains data and has omitted the mapping symbol. Something like:

 .section .text.reallydata, "ax", %progbits
 .word 0x1
 // more data statements

Under the current rules this would be correctly disassembled as data, but under the new rules with an implicit $x it would be incorrectly disassembled as code.

Code following the new rules would add a $d but older objects may exist.
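
A minimal sketch of the proposed rule, to make the problematic case concrete. The function name `implicit_mapping` is hypothetical; the flag values are the standard ELF ones.

```python
SHF_ALLOC = 0x2      # ELF section flag values from the ELF specification
SHF_EXECINSTR = 0x4

def implicit_mapping(sh_flags: int, mapping_symbols: list):
    """Implicit mapping symbol at offset 0 under the proposed rule, or None."""
    if mapping_symbols:
        return None  # the rule only applies to sections with no mapping symbols
    return "$x" if sh_flags & SHF_EXECINSTR else "$d"

# .text.reallydata has flags "ax" and, in an older object, no mapping symbol.
print(implicit_mapping(SHF_ALLOC | SHF_EXECINSTR, []))  # prints $x
# A plain data section without mapping symbols stays data.
print(implicit_mapping(SHF_ALLOC, []))                  # prints $d
```

The first call is the case described above: an executable section holding only data, with no explicit $d, would now be treated as code.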

MaskRay commented 1 month ago

> Will need to have a check with some colleagues on the impact of making the change. I'm more worried about the impact on binary scanners (something like BOLT, or the ERRATA fixes in linkers) as they need to avoid scanning code as data, and vice-versa.

In lld/ELF/AArch64ErrataFix.cpp, the code around auto codeSym = mapSyms.begin(); needs to be modified to assume that SHF_EXECINSTR sections have an implicit $x at offset 0.

Post-link optimizers like BOLT should be fine as long as they make the same implicit $d/$x assumption.

In addition, Mach-O uses a range-based implementation:

struct data_in_code_entry {
    uint32_t    offset;  /* from mach_header to start of data range*/
    uint16_t    length;  /* number of bytes in data range */
    uint16_t    kind;    /* a DICE_KIND_* value  */
};

Disassemblers (including profilers and post-link optimizers) working with Mach-O essentially have to assume implicit $d/$x.
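
To illustrate what a range-based scheme implies for a consumer, here is a small sketch that splits a region into code and data from a list of data ranges. It is not Mach-O parsing code: `classify` is a hypothetical helper, and for simplicity the offsets are relative to the start of the region rather than to the mach_header.

```python
def classify(data_ranges, size):
    """Split [0, size) into code and data intervals.

    data_ranges: non-overlapping (offset, length) pairs. For simplicity the
    offsets are relative to the start of the region, not to the mach_header.
    """
    intervals = []
    cursor = 0
    for offset, length in sorted(data_ranges):
        if offset > cursor:
            # Bytes not covered by any entry are implicitly code.
            intervals.append((cursor, offset, "code"))
        intervals.append((offset, offset + length, "data"))
        cursor = offset + length
    if cursor < size:
        intervals.append((cursor, size, "code"))
    return intervals

print(classify([(4, 4)], 12))
# prints [(0, 4, 'code'), (4, 8, 'data'), (8, 12, 'code')]
```

Everything outside a listed range is code by default, which is the same kind of implicit state as an implicit $x at the start of a text section.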

> A possible rewriting would be to add an implicit $d at offset 0 if a section has no mapping symbols and does not have SHF_EXECINSTR flag. Add an implicit $x at offset 0 if a section has no mapping symbols but does have the SHF_EXECINSTR flag.

Yes, this is the assembler implementation strategy. The ending mapping symbol is there to avoid mapping symbol insertion when transitioning between sections in the linker.

> The case where this would change behaviour from today is if there is a section with SHF_EXECINSTR but contains data and has omitted the mapping symbol. Something like:
>
>  .section .text.reallydata, "ax", %progbits
>  .word 0x1
>  // more data statements
>
> Under the current rules this would be correctly disassembled as data, but under the new rules with an implicit $x it would be incorrectly disassembled as code.
>
> Code following the new rules would add a $d but older objects may exist.

I apologize if my previous explanation wasn't entirely clear. This example should work fine.

In my modified LLVM integrated assembler, $d is inserted at offset 0 for .word, and ending $x is inserted at offset 4.

Assemblers use a state machine to determine when to insert mapping symbols. LLVM integrated assembler assumes sections begin in the EMS_None state, and any initial data (instruction or not) requires a mapping symbol.

My modified assembler assumes the initial state is EMS_A64 for SHF_EXECINSTR sections. This avoids unnecessary mapping symbols for initial instructions while still emitting $d for initial data.
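
The two behaviors can be compared with a small model of the state machine. This is a sketch, not the LLVM implementation: `assemble_text` is a hypothetical function, and `EMS_Data` is assumed here as the name of the data state alongside the `EMS_None` and `EMS_A64` states mentioned above.

```python
def assemble_text(directives, optimized):
    """Return the (offset, name) mapping symbols for an SHF_EXECINSTR section.

    directives: list of ("insn" | "data", size_in_bytes).
    optimized:  False for the traditional behavior (start in EMS_None),
                True for the modified one (start in EMS_A64, ending $x).
    """
    state = "EMS_A64" if optimized else "EMS_None"
    symbols = []
    offset = 0
    for kind, size in directives:
        wanted = "EMS_A64" if kind == "insn" else "EMS_Data"
        if state != wanted:
            # State change: emit a mapping symbol at the current offset.
            symbols.append((offset, "$x" if wanted == "EMS_A64" else "$d"))
            state = wanted
        offset += size
    if optimized and state != "EMS_A64":
        # Ending $x so that whatever follows starts in the A64 state again.
        symbols.append((offset, "$x"))
    return symbols

text1 = [("insn", 4), ("data", 4)]       # nop; .long 42
reallydata = [("data", 4)]               # .word 0x1
print(assemble_text(text1, optimized=False))      # [(0, '$x'), (4, '$d')]
print(assemble_text(text1, optimized=True))       # [(4, '$d'), (8, '$x')]
print(assemble_text(reallydata, optimized=True))  # [(0, '$d'), (4, '$x')]
```

The last line reproduces the .text.reallydata case: $d at offset 0 and an ending $x at offset 4.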

smithp35 commented 1 month ago

After discussion internally, we think it is best to stick with the existing text.

The original text is written that way so that a linker can combine sections without needing to deal with mapping symbols spilling over from the previous section. For example:

  .text
fn:
  ldr x0,=external
  ret

Will produce

    $x
        LDR      x0,{pc}+8 ; 0x8 ; [0x8] =
        RET
    $d
        .xword   0 // R_AARCH64_ABS64 external

If at link time another executable section follows, under the current ABI rules the $x at the start of that section will terminate the range of the $d without the linker taking any action. However if the $x is implicit at the start of the next executable section then the linker would need to insert a $x. The ABI prefers a format that just works without the linker needing to understand mapping symbols.

There is scope for a smart enough linker to remove the redundant mapping symbols from the output static symbol table. For example, Arm's proprietary linker will do this. As mapping symbols are only in the static symbol table (not SHF_ALLOC), it should be possible to do this in LLD and GNU ld if the goal is to reduce the symbol table size. Presumably this could also help debugging unstripped executables, as there are fewer symbols to deal with.
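
A minimal sketch of what such a linker-side cleanup could look like, assuming the mapping symbols of one output section are already in address order. `drop_redundant` is a hypothetical helper and does not reflect how any existing linker implements this.

```python
def drop_redundant(symbols):
    """Drop mapping symbols that repeat the state already in effect.

    symbols: (address, name) pairs in address order within one output section.
    """
    kept = []
    for address, name in symbols:
        # Compare only "$x"/"$d": the ABI also allows names such as "$x.foo".
        if kept and kept[-1][1][:2] == name[:2]:
            continue
        kept.append((address, name))
    return kept

linked = [(0, "$x"), (16, "$x"), (24, "$d"), (32, "$x"), (36, "$x.foo")]
print(drop_redundant(linked))
# prints [(0, '$x'), (24, '$d'), (32, '$x')]
```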

rearnsha commented 1 month ago

The ABI has always used the principle of ‘dumb format, smart tools’. That means that whenever there is a choice to make, as here, we favour the form that allows default behaviours in the tools to get a correct answer.

In this case, eliding some mapping symbols would mean that linkers would have to know how to reinsert missing mappings (e.g. because the previous section in a newly merged section ended with a literal pool, marked with $d), and thus by default would generate the wrong output.

I think for cases like this we would prefer smart linkers to elide redundant mapping symbols if they are being smart, rather than requiring all linkers to learn how to insert them.

algrant-arm commented 1 month ago

If mapping symbols were allowed and recommended to have sizes, and a default symbol defined for each section type, then it would be possible to remove many symbols entirely, and not just at the start of sections, perhaps achieving a significant reduction in symtab sizes. It would also reduce the risk alluded to above of mappings "running on" when concatenated with a following unmapped code or data section - a problem we already see.

E.g. any executable section in a 64-bit image could be mapped as $x by default, and a sized $d would denote a specific range of literals and a zero-sized $d would map until the next mapping symbol as per legacy behavior.
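
One way to read this idea is sketched below. The lookup function `state_at` is hypothetical, and it assumes that once a sized range ends the section default applies again, which the proposal does not spell out.

```python
def state_at(symbols, address, default="$x"):
    """Mapping state at `address` under the sized-symbol idea.

    symbols: (start, name, size) triples in address order.
    size == 0 keeps the legacy meaning: the state lasts until the next symbol.
    Assumption: after a sized range ends, the section default applies again.
    """
    state = default
    for start, name, size in symbols:
        if start > address:
            break
        if size == 0:
            state = name
        elif address < start + size:
            return name
        else:
            state = default
    return state

sized = [(8, "$d", 8)]    # an 8-byte literal pool at offset 8
legacy = [(8, "$d", 0)]   # unsized $d, legacy behavior
print(state_at(sized, 0), state_at(sized, 12), state_at(sized, 16))
# prints $x $d $x
print(state_at(legacy, 16))
# prints $d
```

With the sized $d, the data mapping stops at offset 16 on its own; with the unsized one it keeps running into whatever follows.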

MaskRay commented 1 month ago

> However if the $x is implicit at the start of the next executable section then the linker would need to insert a $x.

My alternative scheme solves the problem by inserting an ending $x at the previous section. This scheme doesn't necessitate a smart linker.
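
The difference can be modeled with a linker that only concatenates sections and rebases their mapping symbols. This is a sketch under simplifying assumptions: `link` and `state_at` are hypothetical helpers, and the section sizes follow the fn example above (two 4-byte instructions plus an 8-byte literal pool entry).

```python
def link(sections):
    """Concatenate sections and rebase their mapping symbols, nothing more.

    sections: list of (size, [(offset, name), ...]) in output order.
    """
    merged = []
    base = 0
    for size, symbols in sections:
        merged.extend((base + offset, name) for offset, name in symbols)
        base += size
    return merged

def state_at(merged, address, default="$x"):
    """Mapping state at `address`; the last symbol at or before it wins."""
    state = default
    for start, name in merged:  # already in address order
        if start <= address:
            state = name
    return state

# fn: ldr (4 bytes), ret (4 bytes), 8-byte literal pool entry.
fn_current = (16, [(0, "$x"), (8, "$d")])
fn_ending_x = (16, [(8, "$d"), (16, "$x")])   # implicit initial $x, ending $x
next_current = (4, [(0, "$x")])
next_implicit = (4, [])                        # relies on an implicit $x

print(state_at(link([fn_current, next_current]), 16))    # $x
print(state_at(link([fn_current, next_implicit]), 16))   # $d: code seen as data
print(state_at(link([fn_ending_x, next_implicit]), 16))  # $x
```

The middle case is the mix-and-match problem: a traditionally assembled section ending in $d followed by a section that relies on the implicit $x.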

--

AArch32 code is often interleaved with constant pools and jump tables, which necessitates the use of mapping symbols. Additionally, some embedded system linkers might combine code and data sections.

However, AArch64 typically avoids these techniques. For example, building Clang with https://github.com/llvm/llvm-project/pull/99718 doesn't generate any(?) mapping symbols with the integrated assembler.

This raises a question: could the current ABI wording be overly restrictive for general use cases? Why do .rodata and .data necessitate $d symbols? While a smart linker might remove unnecessary symbols, eliminating them entirely from relocatable files would lead to smaller code sizes and create a fairer comparison with x86-64.

Ideally, we'd have an LLVM internal option or assembler flag that caters to the needs of most non-embedded development scenarios.

rearnsha commented 1 month ago

> My alternative scheme solves the problem by inserting an ending $x at the previous section.

I’m not sure that’s well defined. You would have a symbol that lies outside of the contents of the file. Furthermore, you would then get a conflict if the section that followed defined a different mapping symbol at the same address. You would then require a smart linker to sort out the problem.

rearnsha commented 1 month ago

> If mapping symbols were allowed and recommended to have sizes, and a default symbol defined for each section type, then it would be possible to remove many symbols entirely, and not just at the start of sections, perhaps achieving a significant reduction in symtab sizes.

That wouldn’t, on its own, allow you to eliminate a mapping symbol at the start of every input section, because it is only a recommendation. A linker would still be required to implement some smart logic to sort out inputs with different conventions. A single object file with an unsized $d at the end of a code section would otherwise change the mapping of the next object.

rearnsha commented 1 month ago

> The ABI has always used the principle of ‘dumb format, smart tools’.

I think I got this statement backwards, btw. It was “smart format, dumb tools”. The idea being that dumb tools will do enough with the image to produce a working image, but still allowing a smart implementation to do even better. Either way, the point is that there must be enough information in the incoming object files for any linker to generate correct output without needing to be extended beyond the basic rules for linking.

MaskRay commented 1 month ago

> My alternative scheme solves the problem by inserting an ending $x at the previous section.
>
> I’m not sure that’s well defined. You would have a symbol that lies outside of the contents of the file. Furthermore, you would then get a conflict if the section that followed defined a different mapping symbol at the same address. You would then require a smart linker to sort out the problem.

This is not a significant concern. The use case is rare, and many users could accept the risk of mistakenly identifying the initial data directives as code, since they don't use these features. Ultimately I just want an opt-in driver option, -Wa,--optimize-mapping-symbols, and hopefully gas will add something similar.

I've added some information to https://maskray.me/blog/2024-07-21-mapping-symbols-rethinking-for-efficiency , which is copied below:

A text section may rarely start with data directives (e.g., -fsanitize=function, LLVM prefix data). When the linker combines two such sections, the ending $x of the first section and the initial $d of the second might have the same address.

.section .text.0, "ax"
// $d
.word 0
// $x this

.section .text.1, "ax"
// $d may have the same address
.word 0
// $x

In a straightforward implementation, symbols are stable-sorted by address and the last symbol at an address wins. Ideally we want $d $x $d $x. If the sections are in different files, a linker that respects input order will naturally achieve this. If they're in the same file, the assembler should output $d $x $d $x instead of $d $d $x $x. This works if .text.0 precedes .text.1 in the linker output, but the other section order might be unexpected. In the worst case where the linker's section order mismatches the assembler's section order (usually necessitating a linker script, which many users don't specify), the initial data directives could be mistakenly identified as code.
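
The three orderings can be checked with a small model of the "stable sort, last symbol at an address wins" rule. This is a sketch: `state_at` is a hypothetical helper, and each section is assumed to be 4 bytes (a single .word), so the second section starts at address 4.

```python
def state_at(symtab, address, default="$x"):
    """Mapping state at `address` for symbols listed in symbol-table order."""
    # sorted() is stable: symbols with equal addresses keep their table order,
    # so the one that appears last in the table wins.
    state = default
    for start, name in sorted(symtab, key=lambda s: s[0]):
        if start <= address:
            state = name
    return state

# .text.0 and .text.1 are 4 bytes each; .text.1 is placed at address 4.
good = [(0, "$d"), (4, "$x"), (4, "$d"), (8, "$x")]      # $d $x $d $x
bad = [(0, "$d"), (4, "$d"), (4, "$x"), (8, "$x")]       # $d $d $x $x
# Same table order as `good`, but the linker placed .text.1 before .text.0.
swapped = [(4, "$d"), (8, "$x"), (0, "$d"), (4, "$x")]

print(state_at(good, 4))     # $d: the .word is data
print(state_at(bad, 4))      # $x: the .word is mistaken for code
print(state_at(swapped, 4))  # $x: the .word is mistaken for code
```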


Again, this is a risk, but a significant portion of users don't care.