GJDuck / e9patch

A powerful static binary rewriting tool
GNU General Public License v3.0

Read the PE symbols #51

Closed AiDaiP closed 2 years ago

AiDaiP commented 2 years ago

Read the PE symbols and insert the functions into elf->plt.

GJDuck commented 2 years ago

Looking into this some more, the code only reads the Windows PE symbol table, which I think corresponds to the (non-dynamic) symbol table in ELF. So if the Windows PE file is stripped, NumberOfSymbols will be zero, meaning that no function addresses will be found. While this is useful for non-stripped binaries, I was hoping for something equivalent to what is currently implemented for ELF (as far as practically possible).
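As a rough illustration of the stripped-binary check described above, the following Python sketch reads PointerToSymbolTable and NumberOfSymbols from the COFF file header of a PE image. It is not the code from this PR. The names `coff_symbol_info` and `fake_pe_header` are hypothetical, and `fake_pe_header` builds only enough of a header for the demo, not a loadable image.

```python
import struct

def coff_symbol_info(data: bytes):
    """Return (PointerToSymbolTable, NumberOfSymbols) from a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature:
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    fields = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    symtab_ptr, nsyms = fields[3], fields[4]
    return symtab_ptr, nsyms

def fake_pe_header(symtab_ptr: int, nsyms: int) -> bytes:
    """Hypothetical helper: a minimal header for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 0, 0, symtab_ptr, nsyms, 0, 0)
    return bytes(dos) + b"PE\0\0" + coff

if __name__ == "__main__":
    _, nsyms = coff_symbol_info(fake_pe_header(0, 0))
    print("stripped" if nsyms == 0 else f"{nsyms} symbols")
```

When NumberOfSymbols is zero, a reader of the symbol table has nothing to iterate over, which is why another source of names, such as the imports, is needed.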

The Windows PE equivalent to the PLT appears to be the Import Address Table, which is something different, and I need to read up more about it.

AiDaiP commented 2 years ago

I also tried adding some code to read the IAT, but not in this PR. Whether or not the PE file is stripped, I can find the function symbol and address in the IAT. For example, I can find "malloc" and 0x929C in the IAT:

.idata:000000018000929C ; void *__cdecl malloc(size_t Size)
.idata:000000018000929C                 extrn __imp_malloc:qword

I think the IAT is more like the ELF GOT.

;ELF GOT
extern:0000000000222B88 ; void *malloc(size_t size)
extern:0000000000222B88                 extrn malloc:near       ; CODE XREF: _malloc↑j
extern:0000000000222B88                                         ; DATA XREF: .got:malloc_ptr↑o

The function that jumps to malloc is at 0x180002C20:

.text:0000000180002C20 ; void *__cdecl malloc(size_t Size)
.text:0000000180002C20 malloc          proc near               ; CODE XREF: create_pointer+14↑p
.text:0000000180002C20                 jmp     cs:__imp_malloc
.text:0000000180002C20 malloc          endp

This is like the ELF PLT:

;ELF PLT
.plt:0000000000003030 ; void *malloc(size_t size)
.plt:0000000000003030 _malloc         proc near               ; CODE XREF: yyalloc_0+13↓p
.plt:0000000000003030                                         ; yyparse+17F↓p ...
.plt:0000000000003030                 jmp     cs:malloc_ptr
.plt:0000000000003030 _malloc         endp

But if the PE file is stripped, reading the Windows PE symbol table to find it is useless.

GJDuck commented 2 years ago

Thanks, this would be quite useful if it can be implemented.

GJDuck commented 2 years ago

It seems that Windows PE does things a bit differently. Calls to external functions (imported from DLLs) seem to be compiled into indirect jumps, e.g.:

     callq  *0xf5b(%rip)    # 0x140002000

Where 0x140002000 is an IAT entry. The IAT entry initially contains an indirect representation of the function name (e.g., GetCommandLineW), which is overwritten by the actual function pointer at runtime.
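To make the "indirect representation" concrete: in a 64-bit PE file on disk, each import thunk is either an ordinal (top bit set) or an RVA pointing to a 2-byte hint followed by a NUL-terminated name. The Python sketch below decodes one such thunk. The function name `describe_thunk64` is hypothetical, and the sketch assumes `image` is laid out so that it can be indexed directly by RVA, whereas a real reader would first translate RVAs to file offsets via the section table.

```python
import struct

ORDINAL_FLAG64 = 1 << 63  # top bit set means "import by ordinal"

def describe_thunk64(image: bytes, thunk_rva: int) -> str:
    """Decode one unbound 64-bit import thunk (assumes image is indexed by RVA)."""
    (thunk,) = struct.unpack_from("<Q", image, thunk_rva)
    if thunk == 0:
        return "<end of table>"
    if thunk & ORDINAL_FLAG64:
        return f"ordinal #{thunk & 0xFFFF}"
    # Otherwise the low 31 bits are the RVA of a hint/name entry:
    # a 2-byte hint followed by a NUL-terminated ASCII name.
    name_rva = thunk & 0x7FFFFFFF
    start = name_rva + 2
    end = image.index(b"\0", start)
    return image[start:end].decode("ascii")

# Tiny synthetic image for the demo (not taken from a real binary).
_buf = bytearray(0x40)
struct.pack_into("<Q", _buf, 0x00, 0x20)                 # by name, entry at RVA 0x20
struct.pack_into("<Q", _buf, 0x08, ORDINAL_FLAG64 | 17)  # by ordinal 17
_name = b"GetCommandLineW\0"
_buf[0x22:0x22 + len(_name)] = _name                     # hint at 0x20 stays zero
DEMO_IMAGE = bytes(_buf)

if __name__ == "__main__":
    for rva in (0x00, 0x08, 0x10):
        print(hex(rva), describe_thunk64(DEMO_IMAGE, rva))
```

At load time the loader replaces each of these values with the resolved function pointer, so only the on-disk form carries the name.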

So the call && target == &malloc idiom does not really work for this case, as there is no (statically known) address for malloc. I think you are correct that the IAT is analogous to the ELF GOT.

A way to match such instructions cannot really be represented in the current matching language. It would need to be something like call && &mem[0] == &malloc, where &mem[0] means "statically evaluate" the memory operand, if possible. For example, the above memory operand will statically evaluate to 0x140002000.
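A minimal Python sketch of what "statically evaluate" could mean for a RIP-relative memory operand: the target is the address of the next instruction plus the signed 32-bit displacement. The function name `static_mem_target` is hypothetical and this is not E9Tool's implementation. The demo bytes `ff 15 5b 0f 00 00` and the instruction address 0x14000109f are assumptions inferred from the callq example above (a 6-byte encoding with displacement 0xf5b that resolves to 0x140002000); they do not appear in the thread.

```python
import struct

def static_mem_target(code: bytes, addr: int):
    """Return the absolute address read by a RIP-relative `ff 15` (call) or
    `ff 25` (jmp) instruction located at `addr`, or None for anything else."""
    if len(code) < 6 or code[0] != 0xFF or code[1] not in (0x15, 0x25):
        return None
    (disp,) = struct.unpack_from("<i", code, 2)  # signed 32-bit displacement
    # RIP-relative operands are relative to the end of the instruction (6 bytes here).
    return (addr + 6 + disp) & 0xFFFFFFFFFFFFFFFF

if __name__ == "__main__":
    # Assumed encoding and address for the callq example above.
    target = static_mem_target(bytes.fromhex("ff155b0f0000"), 0x14000109F)
    print(hex(target))
```

Operands that involve a general-purpose register cannot be evaluated this way, which is why the idiom applies only "if possible".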

GJDuck commented 2 years ago

I've merged in an improved version of this PR that also includes parsing the IAT.

For some Windows binaries, there does appear to be something like a PLT, but it seems to just appear in the executable code without any special reference or marking. E.g., from dumpbin.exe:

  140001e30:   ff 25 5a 02 00 00       jmpq   *0x25a(%rip)        # 0x140002090
  140001e36:   ff 25 4c 02 00 00       jmpq   *0x24c(%rip)        # 0x140002088
  140001e3c:   ff 25 1e 03 00 00       jmpq   *0x31e(%rip)        # 0x140002160
  140001e42:   ff 25 10 03 00 00       jmpq   *0x310(%rip)        # 0x140002158
  ...

Here, the jumps are reading locations from the IAT.

Because these are not specially marked, these locations are not currently given symbolic names. But these jumps could still be targets for rewriting, using the idiom:

-M 'jump and &mem[0] == &malloc'

Where &malloc will resolve to the corresponding IAT entry.
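The effect of such a match can be sketched in Python against the four jumps in the listing above. This is only an illustration of the idea, not how E9Tool evaluates the expression. The `IAT` map is hypothetical, since the thread does not say which function each IAT slot belongs to, and `find_import_jumps` is a made-up name. The sketch also scans raw bytes for `ff 25`, which a real tool would replace with proper disassembly to avoid false matches.

```python
import struct

# Raw bytes copied from the listing above, starting at 0x140001e30.
BASE = 0x140001E30
CODE = bytes.fromhex("ff255a020000" "ff254c020000" "ff251e030000" "ff2510030000")

# Hypothetical name -> IAT slot map; the real names for these slots are unknown.
IAT = {"malloc": 0x140002088, "free": 0x140002160}

def find_import_jumps(code: bytes, base: int, iat: dict, name: str):
    """Return addresses of `jmp *disp(%rip)` instructions that read iat[name]."""
    want = iat[name]
    hits = []
    for off in range(len(code) - 5):
        if code[off] == 0xFF and code[off + 1] == 0x25:
            (disp,) = struct.unpack_from("<i", code, off + 2)
            # Statically evaluate the memory operand: next instruction + disp.
            if base + off + 6 + disp == want:
                hits.append(base + off)
    return hits

if __name__ == "__main__":
    for name in IAT:
        print(name, [hex(a) for a in find_import_jumps(CODE, BASE, IAT, name)])
```

Under the hypothetical map, the match for malloc selects only the jump at 0x140001e36, whose operand evaluates to 0x140002088.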