GaloisInc / macaw

Open source binary analysis tools.

`macaw-ppc`: Generalize `ppc64_linux_info` to take a `Maybe (TOC w)` argument instead of a `LoadedBinary` #415

Open RyanGlScott opened 3 months ago

RyanGlScott commented 3 months ago

Compared to other ArchitectureInfo-producing functions, ppc64_linux_info is somewhat unusual in that it requires a LoadedBinary as an argument:

https://github.com/GaloisInc/macaw/blob/1a2b9284f11eaa67420d29641afe3d32ae1418af/macaw-ppc/src/Data/Macaw/PPC.hs#L91-L94

This is because determining the location of the entrypoint address is complicated on PPC64. In some cases, PPC64 binaries can have an .opd section with a table of contents (TOC), which is used to translate function addresses to the actual locations where the functions are defined. See this part of macaw-ppc-loader for the full story.

Ultimately, ppc64_linux_info's LoadedBinary argument is used to obtain the TOC for a binary. This approach is somewhat unsatisfactory, however, for two reasons:

  1. Not all PPC64 binaries have .opd sections (see https://github.com/GaloisInc/macaw-loader/issues/21). For binaries without .opd sections, needing to pass in a TOC is overkill, since the TOC won't be used as part of the translation.
  2. It would be nice to develop a macaw-symbolic-syntax backend for PPC64 which consumes Crucible S-expression programs as input instead of binaries. This is currently impossible, however, as it is not possible to supply ppc64_linux_info with a LoadedBinary when the input isn't a binary in the first place.

To address both of these issues, I propose that we generalize the type of ppc64_linux_info to take a Maybe (TOC w) argument instead of a LoadedBinary argument. This way, one could pass Nothing to ppc64_linux_info whenever one has an .opd-less binary or a Crucible S-expression program as input. This approach would closely mirror macaw-ppc's mkInitialAbsState function (which ppc64_linux_info invokes), which also takes a Maybe (TOC w) argument.
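To make the proposed shape concrete, here is a minimal, self-contained sketch. It does not use the real macaw types: `TOC`, `ArchInfo`, `resolveEntry`, and `ppc64LinuxInfo` are all hypothetical stand-ins, the `w` width parameter is dropped for brevity, and the "fall back to the address itself" behaviour is only an assumption made for illustration.

```haskell
import qualified Data.Map as Map
import Data.Word (Word64)

-- Hypothetical stand-in for the loader's TOC: maps the address of a
-- function descriptor to the address where the function's code lives.
newtype TOC = TOC (Map.Map Word64 Word64)

-- Hypothetical stand-in for ArchitectureInfo. It only records how an
-- entry point address is resolved, which is the part the TOC affects.
newtype ArchInfo = ArchInfo { resolveEntry :: Word64 -> Word64 }

-- Sketch of the generalized function: the TOC is optional rather than
-- being extracted from a LoadedBinary.
ppc64LinuxInfo :: Maybe TOC -> ArchInfo
ppc64LinuxInfo mtoc = ArchInfo resolve
  where
    resolve addr = case mtoc of
      -- No TOC (.opd-less binary or S-expression input): use the address as-is.
      Nothing -> addr
      -- With a TOC: translate through it, defaulting to the address itself
      -- when there is no entry (an assumption of this sketch).
      Just (TOC m) -> Map.findWithDefault addr addr m

main :: IO ()
main = do
  let withToc = ppc64LinuxInfo (Just (TOC (Map.fromList [(0x1000, 0x4000)])))
      noToc   = ppc64LinuxInfo Nothing
  print (resolveEntry withToc 0x1000)
  print (resolveEntry noToc 0x1000)
```

The point of the sketch is only that callers without a binary at hand can still construct the value by passing Nothing.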

Note that it will likely be convenient to fix this issue and https://github.com/GaloisInc/macaw-loader/issues/21 at the same time. A reasonable way to fix https://github.com/GaloisInc/macaw-loader/issues/21 would be to change the return type of getTOC from TOC w to Maybe (TOC w). After this change has been made, one can call ppc64_linux_info (getTOC binary) to obtain an ArchitectureInfo value from a PPC64 binary.
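As a rough illustration of how the two changes would compose, the sketch below uses toy types only. `Binary`, `opdEntries`, and `describeInfo` are hypothetical names invented for this example, and `getTOC` here is a simplified stand-in for the real loader function, not its actual implementation.

```haskell
import qualified Data.Map as Map
import Data.Word (Word64)

-- Hypothetical stand-in for the loader's TOC.
newtype TOC = TOC (Map.Map Word64 Word64)

-- Hypothetical stand-in for a loaded binary: just its .opd entries,
-- or Nothing when the binary has no .opd section.
newtype Binary = Binary { opdEntries :: Maybe [(Word64, Word64)] }

-- Sketch of getTOC with the proposed Maybe return type: a missing
-- .opd section yields Nothing instead of being an error.
getTOC :: Binary -> Maybe TOC
getTOC = fmap (TOC . Map.fromList) . opdEntries

-- Stand-in for the generalized ppc64_linux_info; it returns a short
-- description so that the result is easy to inspect.
describeInfo :: Maybe TOC -> String
describeInfo Nothing        = "no TOC"
describeInfo (Just (TOC m)) = "TOC with " ++ show (Map.size m) ++ " entries"

main :: IO ()
main = do
  -- Mirrors the call shape `ppc64_linux_info (getTOC binary)`.
  putStrLn (describeInfo (getTOC (Binary (Just [(0x1000, 0x4000)]))))
  putStrLn (describeInfo (getTOC (Binary Nothing)))
```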