facebookarchive / BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries

Support for Symbolizing Thread-Local Data in .tbss? #246

Open scottconstable opened 2 years ago

scottconstable commented 2 years ago

I'm trying to write a pass that inserts instrumentation code and stores its statistics in TLS (for example, in .tbss). But I cannot figure out how to find MCSymbols for thread-local data. When I write:

for (auto &Symbol : BC.GlobalSymbols)
    outs() << Symbol.first() << '\n';

I see all of the symbols in .data and .bss, but I can't see any data that a given binary exposes in .tbss. I also do not see any members or methods on BinaryContext that reference TLS.

Is this supported? If not, are there plans to support it? Is there a workaround?

yota9 commented 2 years ago

Hello @scottconstable! TLS sections are ignored: they are marked as non-allocatable by the BinarySection isAllocatable() check during discoverFileObjects. Also be aware that there are later checks, such as address == 0, and a zero address is valid for a TLS symbol because its address is really an offset from the TLS phdr. There is currently no reason to import these symbols, since BOLT doesn't change the .tbss/.tdata sections. To add something to them you would need to expand them, which in turn requires patching the ELF sections and the phdr. That is not easy to do.

scottconstable commented 2 years ago

Hi @yota9! Thank you for the prompt response. I was considering either (a) inserting the new TLS symbol with BOLT, or (b) defining the new TLS symbol in the target program's source code. I would have preferred (a), but given your answer it sounds like I should settle for (b).

So suppose I have a program like this:

#include <stdint.h>
#include <stdio.h>

__thread uint64_t basic_block_count = 0;
int main() {
    printf("Hello, world!\n");
}

Ideally I would like to be able to compile a hello binary and then have BOLT load the binary, find the basic_block_count symbol, and insert an INC instruction at the beginning of each basic block to increment basic_block_count. I think I can do this with a lot of effort by scanning .dynsym and .dynstr to find basic_block_count's offset within .tbss and then manually constructing the FS-relative offset to form the memory address. But, is there an easier way to do this with BOLT APIs?

yota9 commented 2 years ago

The second approach is easier, since you already have the symbol and the reserved space. After a few changes in discoverFileObjects you will be able to look up your symbol by name in your pass and insert target-specific code into each BB that accesses the TLS variable and increments it. There is no ready-made functionality for this, but it doesn't sound like a very difficult task.

scottconstable commented 2 years ago

@yota9 your analysis above was perfect. I added just 2 LoC into discoverFileObjects:

  1. I changed the isSymbolInMemory functor to return true if the symbol belongs to TLS.
  2. I changed the null-address check so that it is only applied to non-TLS symbols.
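The exact two-line patch is not shown in this thread. As a rough, self-contained sketch of the idea, the predicate below classifies a symbol as TLS from its ELF st_info byte and relaxes a null-address filter only for such symbols. The helper names (isTLSSymbol, shouldSkipSymbol) are hypothetical and are not BOLT APIs. The constant 6 is the STT_TLS value from the ELF specification.

```cpp
#include <cstdint>

// Value of STT_TLS in the ELF specification, reproduced here so the sketch
// does not depend on <elf.h>.
constexpr uint8_t kSttTls = 6;

// The symbol type lives in the low four bits of st_info (what the
// ELF64_ST_TYPE macro extracts).
constexpr uint8_t symbolType(uint8_t StInfo) { return StInfo & 0xf; }

// Hypothetical helper: true if the symbol is thread-local.
constexpr bool isTLSSymbol(uint8_t StInfo) {
  return symbolType(StInfo) == kSttTls;
}

// Hypothetical stand-in for the relaxed null-address check. A zero address is
// rejected for ordinary symbols but accepted for TLS symbols, whose value is
// an offset into the TLS segment and may legitimately be 0.
constexpr bool shouldSkipSymbol(uint64_t Address, uint8_t StInfo) {
  return Address == 0 && !isTLSSymbol(StInfo);
}
```

For example, an st_info of 0x16 (global binding, TLS type) with address 0 is kept, while 0x11 (global binding, object type) with address 0 is skipped.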

Now my pass can see the TLS symbols in the BinaryContext symbol table. However, when I create an MCInst that references one of these TLS symbols, BOLT emits a RIP-relative memory access, whereas I would have expected something like:

incq %fs:FFFFFFFFFFFFFFF8

Here is my pass code:

  auto &MIB = *BC.MIB;
  auto *BasicBlockCounterSymbol = BC.Ctx->lookupSymbol("__basic_blocks");
  assert(BasicBlockCounterSymbol && "Could not find symbol");
  MCInst IncInst;
  MIB.createIncMemory(IncInst, BasicBlockCounterSymbol, BC.Ctx.get());
  for (auto &It : BC.getBinaryFunctions()) {
    BinaryFunction &Function = It.second;
    for (BinaryBasicBlock &BB : Function) {
      BB.insertInstruction(BB.begin(), IncInst);
    }
  }

Do you have any suggestions?

yota9 commented 2 years ago

@scottconstable This is what I meant when I said you would need to create your own target-specific code. createIncMemory doesn't know that this is a TLS symbol. I'm not sure about x86, but I assume it is similar to ARM: you need to access the thread register, and, knowing the data structure it points to and the offset (address) of the TLS variable, you can then reach the data. Since the offset won't change, you can use getAddress() as the offset. But createIncMemory doesn't access the thread register at all; it just increments a value in memory, which is not right in this case. You will need to write your own function, something like createIncTlsMemory, to do it.

scottconstable commented 2 years ago

Thanks @yota9. I am now able to get the TLS offset in my pass by doing the following:

  const BinaryData *CounterBD = BC.getBinaryDataByName("__basic_blocks");
  uint64_t CounterOffset = CounterBD->getAddress();

I then use LLVM's existing MC infrastructure to build instructions with TLS-relative memory operands.
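As a hedged illustration of the offset arithmetic (not the code used in this thread): on x86-64 the static TLS block of the main executable ends at the address held in %fs, so a variable's %fs-relative displacement is its offset within the TLS segment minus the aligned segment size. The sketch below assumes the local-exec model, assumes the PT_TLS p_vaddr is itself aligned to p_align, and uses hypothetical helper names. The segment size and alignment would come from the PT_TLS program header.

```cpp
#include <cstdint>

// Round Value up to the next multiple of Align (an alignment of 0 or 1 means
// no alignment constraint).
constexpr uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return Align <= 1 ? Value : (Value + Align - 1) / Align * Align;
}

// Hypothetical helper: %fs-relative displacement of a TLS variable in the
// main executable on x86-64.
//   SymbolValue - offset of the variable from the start of the TLS segment
//   TlsMemSize  - p_memsz of the PT_TLS program header
//   TlsAlign    - p_align of the PT_TLS program header
// Assumes p_vaddr of PT_TLS is aligned to p_align.
constexpr int64_t tlsOffsetFromThreadPointer(uint64_t SymbolValue,
                                             uint64_t TlsMemSize,
                                             uint64_t TlsAlign) {
  return static_cast<int64_t>(SymbolValue) -
         static_cast<int64_t>(alignTo(TlsMemSize, TlsAlign));
}
```

With a single 8-byte counter at offset 0 in an 8-byte, 8-aligned TLS segment, this yields -8, which is 0xFFFFFFFFFFFFFFF8 as an unsigned 64-bit value and matches the displacement in the incq example above.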

I have one more question. Sometimes the __basic_blocks symbol is imported by BOLT (or maybe by LLVM) with a numeric suffix, for example __basic_blocks/1. I can't figure out why this happens on some binaries but not others. I tried BinaryContext::postProcessSymbolTable() but this made no difference. Is there a reliable way to always find a unique symbol?

maksfb commented 2 years ago

/<N> is appended to all local symbol names to differentiate between multiple local symbols with the same name. Check the attributes of the symbol the compiler/linker is generating.
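One possible workaround, sketched under the assumption that the pass can iterate over symbol names as strings, is to accept both the plain name and any name/<N> variant, then check that exactly one candidate remains. The helpers below (matchesBaseName, findCandidates) are hypothetical and are not part of BOLT.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// True if Name is exactly Base, or Base followed by "/<N>" where <N> is one
// or more decimal digits.
bool matchesBaseName(const std::string &Name, const std::string &Base) {
  if (Name.compare(0, Base.size(), Base) != 0)
    return false;
  if (Name.size() == Base.size())
    return true;
  if (Name[Base.size()] != '/' || Name.size() == Base.size() + 1)
    return false;
  for (std::size_t I = Base.size() + 1; I < Name.size(); ++I)
    if (!std::isdigit(static_cast<unsigned char>(Name[I])))
      return false;
  return true;
}

// Collect every symbol name that refers to Base, with or without a suffix.
std::vector<std::string> findCandidates(const std::vector<std::string> &Names,
                                        const std::string &Base) {
  std::vector<std::string> Result;
  for (const std::string &Name : Names)
    if (matchesBaseName(Name, Base))
      Result.push_back(Name);
  return Result;
}
```

If more than one candidate comes back, the name is genuinely ambiguous and the pass needs another way to choose. The symbol's binding can be inspected in the Bind column of readelf -s output.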