das-labor / panopticon

A libre cross-platform disassembler.
https://panopticon.re
GNU General Public License v3.0

Emulate ld-linux behaviour for ELF loading #83

Open flanfly opened 8 years ago

m4b commented 7 years ago

I think this is essentially done in https://github.com/m4b/dryad

Although it's not a library and has some limitations, the hard part is done, as it can:

  1. Properly mmap the load sections in the ELF binary
  2. Relocate runtime symbols to their proper VM address
  3. Perform runtime symbol resolution
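As a rough illustration of step 2, here is a minimal sketch of applying base-relative relocations (in the style of `R_X86_64_RELATIVE`, where the stored value is the load bias plus the addend) to an in-memory image. The `Rela` struct and `apply_relative` function are hypothetical names made up for this example, not dryad's API, and the sketch assumes 64-bit little-endian slots.

```rust
/// Hypothetical, simplified relocation record, loosely modelled on Elf64_Rela.
pub struct Rela {
    /// Offset of the slot to patch, relative to the start of the image.
    pub offset: u64,
    /// Constant added to the load bias.
    pub addend: i64,
}

/// Patch each 8-byte slot with `load_bias + addend`.
/// Assumption: the target is 64-bit little-endian.
pub fn apply_relative(image: &mut [u8], load_bias: u64, relocs: &[Rela]) -> Result<(), String> {
    for r in relocs {
        let start = r.offset as usize;
        let end = start
            .checked_add(8)
            .ok_or_else(|| "relocation offset overflows".to_string())?;
        // Reject relocations that point outside the mapped image.
        let slot = image
            .get_mut(start..end)
            .ok_or_else(|| format!("relocation at {:#x} is out of bounds", r.offset))?;
        let value = load_bias.wrapping_add(r.addend as u64);
        slot.copy_from_slice(&value.to_le_bytes());
    }
    Ok(())
}
```

In a simulated environment, `image` would be the analysis tool's own copy of the mapped segments rather than live process memory.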

I think 1 and 2 could be dropped in without modification into a simulated environment, with a few caveats:

  a. Load sections are never munmapped. This is somewhat easy to implement (I never bothered, since the OS unmaps them after the program terminates), or it might be best to switch to the memmap crate for cross-platform support.
  b. We would have to turn off ifunc resolution or simulate it with RREIL, since the dynamic linker executes the code.
  c. I'm not sure how TLS would be involved in all this; we will need to discuss this more.
  d. Loading uses the host machine's endianness (which is always the case for a real dynamic linker), so for starters we might have to restrict this to binaries with the endianness of the host. We will also have to think about how to do different-endian loading (I suspect reading values will obey the binary's specified byte order, but the actual memmap will be unchanged?).
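To make the endianness caveat concrete, here is a minimal sketch of reading a value according to the binary's declared byte order rather than the host's, leaving the mapped bytes untouched. The `ByteOrder` enum and `read_u32` helper are hypothetical names for this example; the only ELF facts relied on are the `EI_DATA` values 1 (`ELFDATA2LSB`) and 2 (`ELFDATA2MSB`).

```rust
/// Byte order as declared in the ELF header's `e_ident[EI_DATA]` byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ByteOrder {
    Little,
    Big,
}

impl ByteOrder {
    /// Decode EI_DATA: 1 is ELFDATA2LSB, 2 is ELFDATA2MSB, anything else is invalid.
    pub fn from_ei_data(b: u8) -> Option<ByteOrder> {
        match b {
            1 => Some(ByteOrder::Little),
            2 => Some(ByteOrder::Big),
            _ => None,
        }
    }
}

/// Read a u32 at `offset`, interpreting the bytes in the binary's byte order.
/// Returns None if the read would run past the end of `bytes`.
pub fn read_u32(bytes: &[u8], offset: usize, order: ByteOrder) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let s = bytes.get(offset..end)?;
    let raw = [s[0], s[1], s[2], s[3]];
    Some(match order {
        ByteOrder::Little => u32::from_le_bytes(raw),
        ByteOrder::Big => u32::from_be_bytes(raw),
    })
}
```

With this approach the memmap itself is never byte-swapped; only the accessor functions differ per target.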

For 3, symbol resolution (i.e., setting the memmap address of imported symbols in the main executable's GOT), we may want to just assume LD_BIND_NOW for starters, i.e., bind all imports on load, or perhaps bind when a particular symbol is selected in the UI? (Although bind-now shouldn't really affect load time that much; symbol resolution is extremely fast.)
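A minimal sketch of what eager, LD_BIND_NOW-style binding could look like in a simulated loader follows. Everything here is hypothetical and for illustration only: the `Import` struct, the `bind_now` function, and the choice to model GOT memory as a map from slot address to stored value.

```rust
use std::collections::HashMap;

/// Hypothetical import record: a symbol name plus the address of its GOT slot.
pub struct Import {
    pub name: String,
    pub got_slot: u64,
}

/// Bind every import up front, in the spirit of LD_BIND_NOW.
///
/// `exports` maps symbol names to their resolved addresses, and `got` models
/// GOT memory as "slot address -> stored value". Returns the names of imports
/// that could not be resolved.
pub fn bind_now(
    imports: &[Import],
    exports: &HashMap<String, u64>,
    got: &mut HashMap<u64, u64>,
) -> Vec<String> {
    let mut unresolved = Vec::new();
    for imp in imports {
        match exports.get(&imp.name) {
            // Resolved: store the symbol's address in its GOT slot.
            Some(&addr) => {
                got.insert(imp.got_slot, addr);
            }
            // Not found in any loaded object: report it to the caller.
            None => unresolved.push(imp.name.clone()),
        }
    }
    unresolved
}
```

Binding on demand from the UI would call the same lookup for a single import instead of looping over all of them.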

Lastly, I'd definitely be interested in turning dryad into an optional VM "loader" mode and making it a library. But if it's desirable and useful to have this done right away, I think it could be hacked up with some copy-paste in a fairly short amount of time.

Let me know what you have in mind and whether this is still a desirable feature (I think it would actually distinguish Panopticon in quite interesting ways), and we can discuss the path forward from there.

flanfly commented 7 years ago

Hey @m4b. We definitely need a loader, but I don't think we need to support loading all linked dynamic libraries for now. The way I imagine it is that we add a symbol table that simply states that symbol X is at address Y. The disassembler picks up that information and adds a symbolic call to X. RREIL has no concept of threads, so TLS does not need to be implemented. I'm not sure what the problem with endianness is. Panopticon (should) always know the byte order of the program it analyses and read addresses accordingly.
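The symbol table idea could be sketched roughly as follows. The `SymbolTable` and `CallTarget` types are hypothetical and are not Panopticon's actual data structures; the point is only that a call whose target address appears in the table becomes a symbolic call, while any other target stays a plain address.

```rust
use std::collections::BTreeMap;

/// How the disassembler would record the target of a call instruction.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CallTarget {
    /// The target address matched a known symbol.
    Symbolic(String),
    /// No symbol is known for this address.
    Concrete(u64),
}

/// Hypothetical table stating "symbol X is at address Y".
#[derive(Default)]
pub struct SymbolTable {
    by_addr: BTreeMap<u64, String>,
}

impl SymbolTable {
    /// Record that `name` lives at `addr`.
    pub fn insert(&mut self, name: &str, addr: u64) {
        self.by_addr.insert(addr, name.to_string());
    }

    /// Turn a call target address into a symbolic call when possible.
    pub fn resolve_call(&self, addr: u64) -> CallTarget {
        match self.by_addr.get(&addr) {
            Some(name) => CallTarget::Symbolic(name.clone()),
            None => CallTarget::Concrete(addr),
        }
    }
}
```

This needs no loaded libraries at all: the table can be filled from the executable's own dynamic symbol and relocation information.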