GaloisInc / macaw

Open source binary analysis tools.
BSD 3-Clause "New" or "Revised" License

`macaw-base`: ELF loader spuriously warns on binary with overlapping PLT/relocation tables #416

Open RyanGlScott opened 3 months ago

RyanGlScott commented 3 months ago

To reproduce this bug, compile this program:

// test.c
int main(void) {
  return 0;
}

Using a PPC32 musl cross-compiler that provides powerpc-linux-musl-gcc:

$ powerpc-linux-musl-gcc test.c -o test.exe

And then load the resulting binary using this macaw-based program, which does nothing except load the test.exe binary and print any warnings that were emitted when calling memoryForElf:

-- Main.hs
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString as BS
import qualified Data.ElfEdit as EE
import qualified Data.Macaw.Memory.ElfLoader as MME

main :: IO ()
main = do
  bytes <- BS.readFile "test.exe"
  case EE.decodeElfHeaderInfo bytes of
    Left (_off, msg) -> fail msg
    Right (EE.SomeElf e) -> do
      case MME.memoryForElf options e of
         Left err -> fail err
         Right (_mem, _sym, warnings, _err) ->
           mapM_ print warnings
  where
    options = MME.LoadOptions { MME.loadOffset = Just 0 }

Running this program will reveal the following warnings:

$ runghc Main.hs
Multiple relocations modify 20000.
Multiple relocations modify 20004.
Multiple relocations modify 20008.
Multiple relocations modify 2000c.

I claim that these warnings are spurious. If you look at the relocations in test.exe, we have:

$ readelf --relocs test.exe 

Relocation section '.rela.dyn' at offset 0x2ec contains 17 entries:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
0001fecc  00000016 R_PPC_RELATIVE               5d0
0001fed0  00000016 R_PPC_RELATIVE               54c
0001fed4  00000016 R_PPC_RELATIVE               6d0
0001fed8  00000016 R_PPC_RELATIVE               3e8
0001fedc  00000016 R_PPC_RELATIVE               61c
0001fee0  00000016 R_PPC_RELATIVE               20014
0001fee8  00000016 R_PPC_RELATIVE               20014
0001fef0  00000016 R_PPC_RELATIVE               20014
0001fef8  00000016 R_PPC_RELATIVE               20010
0001ff00  00000016 R_PPC_RELATIVE               734
0001ff08  00000016 R_PPC_RELATIVE               20018
00020010  00000016 R_PPC_RELATIVE               20010
0001fee4  00000501 R_PPC_ADDR32      00000000   _ITM_deregisterTM[...] + 0
0001feec  00000401 R_PPC_ADDR32      00000000   _ITM_registerTMCl[...] + 0
0001fef4  00000201 R_PPC_ADDR32      00000000   __cxa_finalize + 0
0001fefc  00000301 R_PPC_ADDR32      00000000   __deregister_fram[...] + 0
0001ff04  00000701 R_PPC_ADDR32      00000000   __register_frame_info + 0

Relocation section '.rela.plt' at offset 0x3b8 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
00020000  00000215 R_PPC_JMP_SLOT    00000000   __cxa_finalize + 0
00020004  00000315 R_PPC_JMP_SLOT    00000000   __deregister_fram[...] + 0
00020008  00000615 R_PPC_JMP_SLOT    00000000   __libc_start_main + 0
0002000c  00000715 R_PPC_JMP_SLOT    00000000   __register_frame_info + 0

Why is macaw giving spurious warnings about the .rela.plt relocations? It's because the addresses of the JMPREL table (i.e., the PLT relocation table) overlap with the addresses of the RELA table (i.e., the relocation table):

$ readelf --dynamic test.exe 

Dynamic section at offset 0xff0c contains 25 entries:
  Tag        Type                         Name/Value
<snip>
 0x00000002 (PLTRELSZ)                   48 (bytes)
 0x00000014 (PLTREL)                     RELA
 0x00000017 (JMPREL)                     0x3b8
 0x00000007 (RELA)                       0x2ec
 0x00000008 (RELASZ)                     252 (bytes)
 0x00000009 (RELAENT)                    12 (bytes)
<snip>

Note that RELASZ (the size of the RELA table) is 252 bytes, so starting from the RELA address, the table spans the range [0x2ec, 0x3e8). Likewise, PLTRELSZ (the size of the JMPREL table) is 48 bytes, so starting from the JMPREL address, the table spans the range [0x3b8, 0x3e8). This means that the JMPREL table is entirely contained within the RELA table, occupying its final 48 bytes. The entry counts agree: with a RELAENT of 12 bytes, the RELA range holds 252 / 12 = 21 entries, which is exactly the 17 .rela.dyn entries plus the 4 .rela.plt entries.
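The range arithmetic above can be checked with a small standalone sketch. The `Range` type and the helper names below are hypothetical and are not part of macaw or elf-edit; the numeric values are taken from the readelf --dynamic output.

```haskell
module Main (main) where

import Data.Word (Word64)
import Numeric (showHex)

-- | A half-open byte range [start, end) within the file.
-- (Hypothetical type for illustration; not a macaw or elf-edit type.)
data Range = Range { rangeStart :: Word64, rangeEnd :: Word64 }
  deriving (Eq, Show)

-- | Build a range from a table's start offset and its size in bytes.
mkRange :: Word64 -> Word64 -> Range
mkRange start size = Range start (start + size)

-- | True if the first range lies entirely inside the second.
containedIn :: Range -> Range -> Bool
containedIn inner outer =
  rangeStart outer <= rangeStart inner && rangeEnd inner <= rangeEnd outer

-- | Number of fixed-size entries that fit in a range.
entryCount :: Word64 -> Range -> Word64
entryCount entSize r = (rangeEnd r - rangeStart r) `div` entSize

relaRange, jmprelRange :: Range
relaRange   = mkRange 0x2ec 252  -- RELA address, RELASZ
jmprelRange = mkRange 0x3b8 48   -- JMPREL address, PLTRELSZ

main :: IO ()
main = do
  putStrLn ("RELA ends at 0x" ++ showHex (rangeEnd relaRange) "")
  putStrLn ("JMPREL ends at 0x" ++ showHex (rangeEnd jmprelRange) "")
  -- Is JMPREL fully inside RELA?
  print (jmprelRange `containedIn` relaRange)
  -- Entry counts, using RELAENT = 12 bytes.
  print (entryCount 12 relaRange, entryCount 12 jmprelRange)
```

Both tables end at 0x3e8, the containment check holds, and the entry counts come out as 21 and 4.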

macaw, on the other hand, currently assumes that the RELA table and the JMPREL table are completely disjoint, as seen in the implementation of dynamicRelocationTable. This function parses the entirety of the RELA table (using dynRelaBuffer/addElfRelaEntries) followed by the entirety of the JMPREL table (using dynPLTRel/addRelaEntries). Moreover, if the JMPREL table contains any relocation whose address was already seen when loading the RELA table, then macaw emits the Multiple relocations modify warning seen above. Because the relocations from the .rela.plt section are contained in both the JMPREL and RELA tables, macaw warns about each of them in the example above.
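To make the failure mode concrete, here is a simplified model of that loading order. It is not macaw's actual code: the `Reloc` type and `addRelocs` function are hypothetical stand-ins, and only two of the 17 .rela.dyn entries are included for brevity. The model loads the RELA table first (which, by size, also covers the four .rela.plt entries) and then the JMPREL table.

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map
import Data.Word (Word32)
import Numeric (showHex)

-- | Hypothetical, simplified relocation: target address plus a label.
data Reloc = Reloc { relocAddr :: Word32, relocLabel :: String }

type LoadState = (Map.Map Word32 String, [String])

-- | Insert relocations into a map keyed by target address, recording a
-- warning for every address that is already present in the map.
addRelocs :: [Reloc] -> LoadState -> LoadState
addRelocs rs st0 = foldl step st0 rs
  where
    step (m, ws) (Reloc a l)
      | Map.member a m =
          (m, ws ++ ["Multiple relocations modify " ++ showHex a "" ++ "."])
      | otherwise = (Map.insert a l m, ws)

-- | The four .rela.plt entries from the readelf output.
pltEntries :: [Reloc]
pltEntries =
  [ Reloc 0x20000 "__cxa_finalize"
  , Reloc 0x20004 "__deregister_frame_info"
  , Reloc 0x20008 "__libc_start_main"
  , Reloc 0x2000c "__register_frame_info"
  ]

-- | The RELA range as determined by RELASZ: .rela.dyn entries (only two
-- shown here) followed by the overlapping .rela.plt entries.
relaTable :: [Reloc]
relaTable =
  [ Reloc 0x1fecc "R_PPC_RELATIVE", Reloc 0x20010 "R_PPC_RELATIVE" ]
    ++ pltEntries

-- | The JMPREL range as determined by PLTRELSZ.
jmprelTable :: [Reloc]
jmprelTable = pltEntries

-- | Load RELA first, then JMPREL, and keep only the warnings.
warnings :: [String]
warnings = snd (addRelocs jmprelTable (addRelocs relaTable (Map.empty, [])))

main :: IO ()
main = mapM_ putStrLn warnings
```

Under this model, each of the four PLT addresses is inserted once via the RELA pass and rejected once via the JMPREL pass, which yields the same four messages as the runghc output above.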

As it turns out, this issue has been reported before in https://github.com/GaloisInc/elf-edit/issues/40, but in an elf-edit context rather than a macaw one. Interestingly, not all gcc targets exhibit this overlapping-table behavior: a PPC64 version of gcc, for example, does not do this. In order to avoid these spurious warnings, macaw will likely need to implement something similar to the algorithm described in https://github.com/GaloisInc/elf-edit/issues/40#issuecomment-1960582054, which determines whether the PLT and relocation tables overlap before attempting to load them.
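One possible shape for such a check is sketched below. This is an assumption for illustration only and does not reproduce the algorithm from the linked elf-edit comment: it simply removes the JMPREL byte range from the RELA byte range before parsing, so that no entry is loaded twice. The `Range` type and `subtractRange` function are hypothetical names.

```haskell
module Main (main) where

import Data.Word (Word64)

-- | A half-open byte range [start, end) within the file.
-- (Hypothetical type for illustration; not a macaw or elf-edit type.)
data Range = Range { rangeStart :: Word64, rangeEnd :: Word64 }
  deriving (Eq, Show)

-- | Remove the second range from the first, returning the pieces of the
-- first range that remain (zero, one, or two pieces).
subtractRange :: Range -> Range -> [Range]
subtractRange (Range s e) (Range cs ce)
  | ce <= s || e <= cs = [Range s e]            -- the ranges are disjoint
  | otherwise =
      [ Range s cs | s < cs ] ++                -- piece before the cut
      [ Range ce e | ce < e ]                   -- piece after the cut

main :: IO ()
main = do
  let rela   = Range 0x2ec 0x3e8  -- from RELA and RELASZ
      jmprel = Range 0x3b8 0x3e8  -- from JMPREL and PLTRELSZ
  -- Parse only these pieces as RELA entries, then parse JMPREL separately.
  print (subtractRange rela jmprel)
```

For test.exe, this leaves the single range [0x2ec, 0x3b8), which is 204 bytes, or 17 entries at 12 bytes each, matching the .rela.dyn section. A real fix would also need to decide what to do when the overlap does not fall on an entry boundary, which this sketch does not address.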