facebookarchive / BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries
2.51k stars 176 forks source link

Handle end section labels #230

Closed yota9 closed 2 years ago

yota9 commented 2 years ago

We need to handle section end labels, that might be located right after last section byte. This patch is also the part of golang support by llvm-bolt.

Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei

rafaelauler commented 2 years ago

Hi Vladislav! Thanks for working on this.

If I understood correctly the intent here, you are updating the location of a symbol that is outside of a section and thus, is not captured by BOLT's regular mechanism of updating such symbols.

Symbols that are inside a function are updated at https://github.com/facebookincubator/BOLT/blob/main/bolt/src/RewriteInstance.cpp#L4643

Notice that we set "CheckPastEnd" equals to false there. I wonder if we change that to "true", if it will update such end labels? @maksfb do you think setting that to true would be a problem?

Also, is it easy to add a test case for us to keep track of this issue?

yota9 commented 2 years ago

@rafaelauler Thank you for your answer!

Notice that we set "CheckPastEnd" equals to false there

The problem here as I see that during the file discovering if the symbol is out of section we will end up with skipping the symbol here. And my golang pass currently relies on a few symbols that must exist in the binary. To be honest this is end label problem is not major one, since currently we need to use extern linkers for golang (usual ld/gold, since the internal gc linker doesn't have emit-relocs option). And with external linker the end symbol is not end of section symbol (we have some emitted _start & etc functions that are located beyond end symbol). But we have some experiments with enabling emit-relocs inside gc and that is where this problem begins :)

Also, is it easy to add a test case for us to keep track of this issue?

Sure, I will do it

yota9 commented 2 years ago

Hello @rafaelauler ! I've added the test, that checks that __etext symbol gets updated. It is a bit ugly, since I don't know an easy way to get end of section address.. Also I've placed it just in test folder, since it is not target-related and in the future I'm propose to place non target specific tests like this, what do you think? Thank you!

rafaelauler commented 2 years ago

There is one test failing in my machine: runtime/X86/hot_end_symbol.s

Can you repro that?

yota9 commented 2 years ago

@rafaelauler No, sorry all the tests are passed on my machine. I will try to take a look at this test closer tomorrow, maybe I will be able to understand the problem..

yota9 commented 2 years ago

@rafaelauler I'm unable to see the problem. Maybe you can attach the whole log of the test here? :)

rafaelauler commented 2 years ago

Sure, here it is: https://pastebin.com/THiNgA81

yota9 commented 2 years ago

@rafaelauler It seems to be that your libc embeddes some extra symbols:

        72: 0000000000a00020 t .annobin___libc_csu_fini.end
        73: 0000000000a00020 t .annobin_elf_init.c_end

That were placed at the end of init and fini sections and previously bolt just ignored them. My libc version doesn't have such symbols, that is why the test works fine

yota9 commented 2 years ago

@rafaelauler To remove libcs start file dependencies I've added simple _start function right in the test. Could you please check it now on your machine? Thank you!

yota9 commented 2 years ago

Hello @rafaelauler @maksfb, gentle ping on the rest of the reviews :)

maksfb commented 2 years ago

Hi Vladislav, thanks for the ping - will get back to you soon.