Closed AiDaiP closed 2 years ago
Looking into this some more, the code only reads the Windows PE symbol table, which I think corresponds to the (non-dynamic) symbol table in ELF. So if the Windows PE file is stripped, NumberOfSymbols will be zero, meaning that no function addresses will be found. While this is useful for non-stripped binaries, I was hoping for something equivalent to what is currently implemented for ELF (as far as practically possible).
The Windows PE equivalent to the PLT appears to be the Import Address Table (IAT), which is something different, and I need to read up more about it.
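To make the stripped case concrete, here is a minimal sketch of reading NumberOfSymbols from the COFF file header. The function name coff_symbol_count is hypothetical and is not taken from this PR. It assumes the standard layout, with e_lfanew at offset 0x3C and the COFF header directly after the "PE\0\0" signature.

```python
import struct

def coff_symbol_count(data: bytes) -> int:
    """Return NumberOfSymbols from the COFF file header of a PE image.

    Hypothetical helper for illustration; a result of 0 means the COFF
    symbol table is absent (e.g. the binary was stripped).
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew (offset 0x3C) is the file offset of the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header, directly after the signature:
    #   Machine(2) NumberOfSections(2) TimeDateStamp(4)
    #   PointerToSymbolTable(4) NumberOfSymbols(4) ...
    _machine, _nsect, _time, _symtab, nsyms = struct.unpack_from(
        "<HHIII", data, pe_off + 4
    )
    return nsyms
```

If this returns 0, a symbol-table-only approach has nothing to work with, so the imports would have to come from somewhere else, such as the IAT.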
I am also trying to add some code to read the IAT, but not in this PR. Whether the PE file is stripped or not, I can find the function symbol and address in the IAT. For example, I can find "malloc" and its address 0x929C in the IAT:
.idata:000000018000929C ; void *__cdecl malloc(size_t Size)
.idata:000000018000929C extrn __imp_malloc:qword
I think the IAT is more like the ELF GOT:
;ELF GOT
extern:0000000000222B88 ; void *malloc(size_t size)
extern:0000000000222B88 extrn malloc:near ; CODE XREF: _malloc↑j
extern:0000000000222B88 ; DATA XREF: .got:malloc_ptr↑o
The stub function that jumps to malloc is at 0x180002C20:
.text:0000000180002C20 ; void *__cdecl malloc(size_t Size)
.text:0000000180002C20 malloc proc near ; CODE XREF: create_pointer+14↑p
.text:0000000180002C20 jmp cs:__imp_malloc
.text:0000000180002C20 malloc endp
This is like the ELF PLT:
;ELF PLT
.plt:0000000000003030 ; void *malloc(size_t size)
.plt:0000000000003030 _malloc proc near ; CODE XREF: yyalloc_0+13↓p
.plt:0000000000003030 ; yyparse+17F↓p ...
.plt:0000000000003030 jmp cs:malloc_ptr
.plt:0000000000003030 _malloc endp
But if the PE file is stripped, reading the Windows PE symbol table to find it is useless.
Thanks, this would be quite useful if it can be implemented.
It seems that Windows PE does things a bit differently. Calls to external functions (imported from DLLs) seem to be compiled into indirect jumps, e.g.:
callq *0xf5b(%rip) # 0x140002000
Where 0x140002000 is an IAT entry. The IAT entry initially contains an indirect representation of the function name (e.g., GetCommandLineW), which is overwritten by the actual function pointer at runtime.
So the 'call && target == &malloc' idiom does not really work for this case, as there is no (statically known) address for malloc. I think you are correct that the IAT is analogous to the ELF GOT.
A way to match such instructions cannot really be represented in the current matching language. It would need to be something like 'call && &mem[0] == &malloc', where &mem[0] means "statically evaluate" the memory operand, if possible. For example, the above memory operand will statically evaluate to 0x140002000.
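As a rough sketch of what "statically evaluate" could mean for a RIP-relative operand: the effective address is the address of the next instruction plus the displacement. The helper names, the iat dictionary, and the instruction address 0x14000109F are assumptions made for illustration. The address is back-computed from the displacement 0xf5b and the target 0x140002000 shown above, and binding GetCommandLineW to that slot is only an example.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def eval_rip_operand(insn_addr: int, insn_len: int, disp: int) -> int:
    """Statically evaluate disp(%rip).

    RIP-relative addressing is relative to the address of the *next*
    instruction, i.e. insn_addr + insn_len. `disp` is the signed 32-bit
    displacement.
    """
    return (insn_addr + insn_len + disp) & MASK64

def matches_import(insn_addr: int, insn_len: int, disp: int,
                   iat: dict, name: str) -> bool:
    """Hypothetical check in the spirit of `&mem[0] == &name`."""
    return iat.get(eval_rip_operand(insn_addr, insn_len, disp)) == name

# Assumed IAT contents (slot address -> imported name), illustration only.
iat = {0x140002000: "GetCommandLineW"}

# callq *0xf5b(%rip) encoded as FF 15 <disp32> is 6 bytes long.
slot = eval_rip_operand(0x14000109F, 6, 0xF5B)
```

The comparison is against the address of the IAT slot, not the function pointer stored in it, since the pointer is only known at runtime.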
I've merged in an improved version of this PR that also includes parsing the IAT.
For some Windows binaries, there does appear to be something like a PLT, but it seems to just appear in the executable code without any special reference or marking. E.g., from dumpbin.exe:
140001e30: ff 25 5a 02 00 00 jmpq *0x25a(%rip) # 0x140002090
140001e36: ff 25 4c 02 00 00 jmpq *0x24c(%rip) # 0x140002088
140001e3c: ff 25 1e 03 00 00 jmpq *0x31e(%rip) # 0x140002160
140001e42: ff 25 10 03 00 00 jmpq *0x310(%rip) # 0x140002158
...
Here, the jumps are reading locations from the IAT.
Because these are not specially marked, these locations are not currently given symbolic names. But these jumps could still be targets for rewriting, using the idiom:
-M 'jump and &mem[0] == &malloc'
Where &malloc will resolve to the corresponding IAT entry.
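For illustration, the unmarked stubs above can be located with a simple byte scan for the 'FF 25 disp32' encoding of jmpq *disp(%rip). This is only a sketch under stated assumptions: find_iat_jump_stubs is a hypothetical name, and a linear byte scan is not a real disassembler, so it can report false positives when the same bytes occur inside other instructions or data.

```python
import struct

def find_iat_jump_stubs(code: bytes, base: int):
    """Return [(stub_addr, slot_addr)] for each `jmp *disp32(%rip)` found.

    `code` is a chunk of executable bytes loaded at virtual address `base`.
    slot_addr is the statically evaluated memory operand, i.e. the
    address of the next instruction plus the signed displacement.
    """
    stubs = []
    i = 0
    while i + 6 <= len(code):
        if code[i] == 0xFF and code[i + 1] == 0x25:
            (disp,) = struct.unpack_from("<i", code, i + 2)
            stubs.append((base + i, base + i + 6 + disp))
            i += 6  # skip over the whole 6-byte instruction
        else:
            i += 1
    return stubs

# The four stubs from the listing above.
code = bytes.fromhex("ff255a020000 ff254c020000 ff251e030000 ff2510030000")
for stub, slot in find_iat_jump_stubs(code, 0x140001E30):
    print(f"{stub:#x} -> IAT slot {slot:#x}")
```

Each slot address could then be looked up in the parsed IAT to give the stub a symbolic name, if that is ever wanted.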
Read the PE symbols and insert the functions into elf->plt.