uxmal / reko

Reko is a binary decompiler.
https://uxmal.github.io/reko
GNU General Public License v2.0

Providing additional information about entities (Procedure/Block/Instruction/Memory) ? #32

Open nemerle opened 9 years ago

nemerle commented 9 years ago

Considering that heuristics are used in many places, and that humans sometimes actually "know better" :smile:, what would be a good way of passing usable additional information to the engine?

Non-exhaustive list of things we might want to do manually:

This should help solve #9.

uxmal commented 9 years ago

  1. the ProcedureCharacteristics class has the property Terminates. No UI for editing it yet.
  2. The serialization format class Procedure_v1 has a Decompile flag that is intended to control decompilation. No UI for editing it yet, and I don't think it is respected by the scanner. Should be easy enough to implement.
  3. Procedure preconditions were added yesterday.
  4. Requires selection capability in the TextView class (see #30 for details).
  5. Is implemented today. Try selecting a memory range and right-click on it, then select "Mark Type"
  6. Is implemented today. Right click on the starting byte of the procedure in the memory view and select "Mark Procedure Entry"
  7. Should be doable, but would need to be done carefully to make sure the call graph doesn't get horribly confused. So, if I have a procedure at address 0x1200 that contains a call 0x1234 instruction, the scanner scans the procedure at 0x1234. If you then mark the code at 0x1234 as undecoded, what should happen to that call in procedure 0x1200?

nemerle commented 9 years ago

ad 7. Ah. 'fun' part -> we'd need a 'todo list' of conflicts :

- instruction at 0x1200 is a control transfer but target address (0x1234) is marked as unknown
- instruction at 0x1204 is a control transfer but target expression (ax+0x12) cannot be computed
- function (fun_1020) at (0x1022: call fun_1000), tried to assume (ds==0x5000) on (fun_1000) but it was already set to different value by a previous call from (fun_1010) at (0x1016: call fun_1000).
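
A minimal sketch of how such a 'todo list' could be collected, covering the three kinds of conflict listed above. This is not Reko code: the `Transfer` and `Assumption` records and the `find_conflicts` function are hypothetical names made up for illustration, and the value 0x4000 for the earlier `ds` assumption is an arbitrary placeholder, since the example above does not say what the previous value was.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical records for this sketch only; these are not Reko types.

@dataclass(frozen=True)
class Transfer:
    site: int                  # address of the call/jump instruction
    target: Optional[int]      # resolved target address, None if it cannot be computed
    target_expr: str = ""      # textual form of an unresolved target, e.g. "ax+0x12"

@dataclass(frozen=True)
class Assumption:
    caller: str                # procedure making the call
    site: int                  # address of the call instruction
    callee: str                # procedure being called
    register: str              # register the caller wants to assume a value for
    value: int                 # assumed value on entry to the callee

def find_conflicts(transfers, unknown_addrs, assumptions):
    """Return human-readable conflict descriptions instead of failing on the first one."""
    conflicts = []
    for t in transfers:
        if t.target is None:
            conflicts.append(
                f"instruction at 0x{t.site:04x} is a control transfer but "
                f"target expression ({t.target_expr}) cannot be computed")
        elif t.target in unknown_addrs:
            conflicts.append(
                f"instruction at 0x{t.site:04x} is a control transfer but "
                f"target address (0x{t.target:04x}) is marked as unknown")

    first_seen = {}            # (callee, register) -> first Assumption recorded
    for a in assumptions:
        first = first_seen.setdefault((a.callee, a.register), a)
        if first.value != a.value:
            conflicts.append(
                f"function ({a.caller}) at (0x{a.site:04x}: call {a.callee}), tried to "
                f"assume ({a.register}==0x{a.value:04x}) on ({a.callee}) but it was already "
                f"set to a different value by a previous call from ({first.caller}) "
                f"at (0x{first.site:04x}: call {first.callee})")
    return conflicts

# Data mirroring the three examples above; 0x4000 is an arbitrary placeholder.
transfers = [Transfer(0x1200, 0x1234), Transfer(0x1204, None, "ax+0x12")]
unknown_addrs = {0x1234}
assumptions = [
    Assumption("fun_1010", 0x1016, "fun_1000", "ds", 0x4000),
    Assumption("fun_1020", 0x1022, "fun_1000", "ds", 0x5000),
]
conflicts = find_conflicts(transfers, unknown_addrs, assumptions)
for line in conflicts:
    print("-", line)
```

Collecting the conflicts into a list, rather than stopping at the first one, is what would let a user work through them one at a time.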

uxmal commented 9 years ago

  1. Should be able to mark the signature of a call site, so that indirect function calls can be used in type inference. E.g. seeing the instruction call [eax+020] the user should be able to mark that as (function __cdecl (int, int) => float) to indicate the type of the virtual function being called.

nemerle commented 9 years ago

Unless we can mark eax as a pointer to vtable struct which has something like :

offset 0x20: -> float (*__cdecl virt_member_20)(int, int);

in its typedef, and can handle this kind of thing :smile:
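
To make the layout concrete, here is a rough model of that vtable struct using Python's `ctypes`. It only illustrates the idea of a typed slot at offset 0x20; it says nothing about how Reko represents types internally. The names `VTable`, `VirtMember20` and `unknown_0x00` are made up for this sketch, and the first 0x20 bytes are left as untyped padding because nothing is known about them.

```python
import ctypes

# Hypothetical model for illustration; not a Reko type.
# CFUNCTYPE uses the C (cdecl) calling convention: float (*)(int, int).
VirtMember20 = ctypes.CFUNCTYPE(ctypes.c_float, ctypes.c_int, ctypes.c_int)

class VTable(ctypes.Structure):
    _fields_ = [
        # Slots before offset 0x20 are unknown, so model them as raw bytes.
        ("unknown_0x00", ctypes.c_ubyte * 0x20),
        # The virtual member reached by the indirect call through eax.
        ("virt_member_20", VirtMember20),
    ]

# Offset of the typed slot inside the struct.
slot_offset = VTable.virt_member_20.offset
print(hex(slot_offset))
```

With eax typed as a pointer to such a struct, the signature of the indirect call follows from the field type, so the call site itself needs no separate annotation.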

uxmal commented 9 years ago

Yep, that would also work. Still depends on #30, i.e. being able to select a line of code and/or a register and give it an annotation.

uxmal commented 9 years ago