NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Open/Watcom C/C++ #156

Open majidf opened 5 years ago

majidf commented 5 years ago

Open/Watcom C/C++ compiled DOS binaries are not supported. Could the signatures be added?

lab313ru commented 5 years ago

Also, its calling convention must be added: eax, edx, ebx, ecx.

[image]

stevecheckoway commented 4 years ago

I created a .cspec file with the Watcom calling convention (although for 32-bit). I was not able to get Ghidra to correctly select the appropriate calling convention.

I'm not sure if I'm just missing something that makes it work. Here's my x86watcom.cspec file.

<?xml version="1.0" encoding="UTF-8"?>

<compiler_spec>
  <data_organization>
     <absolute_max_alignment value="0" />
     <machine_alignment value="2" />
     <default_alignment value="1" />
     <default_pointer_alignment value="4" />
     <pointer_size value="4" />
     <wchar_size value="4" />
     <short_size value="2" />
     <integer_size value="4" />
     <long_size value="4" />
     <long_long_size value="8" />
     <float_size value="4" />
     <double_size value="8" />
     <long_double_size value="12" />
     <size_alignment_map>
          <entry size="1" alignment="1" />
          <entry size="2" alignment="2" />
          <entry size="4" alignment="4" />
          <entry size="8" alignment="4" />
     </size_alignment_map>
  </data_organization>
  <global>
    <range space="ram"/>
  </global>
  <stackpointer register="ESP" space="ram"/>
  <returnaddress>
    <varnode space="stack" offset="0" size="4"/>
  </returnaddress>
  <default_proto>
    <prototype name="__cdecl" extrapop="4" stackshift="4">
      <input>
        <pentry minsize="1" maxsize="500" align="4">
          <addr offset="4" space="stack"/>
        </pentry>
      </input>
      <output killedbycall="true">
        <pentry minsize="4" maxsize="10" metatype="float" extension="float">
          <register name="ST0"/>
        </pentry>
        <pentry minsize="1" maxsize="4">
          <register name="EAX"/>
        </pentry>
        <pentry minsize="5" maxsize="8">
          <addr space="join" piece1="EDX" piece2="EAX"/>
        </pentry>
      </output>
      <unaffected>
        <register name="ESP"/>
        <register name="EBP"/>
        <register name="ESI"/>
        <register name="EDI"/>
        <register name="EBX"/>
      </unaffected>
      <killedbycall>
        <register name="ECX"/>
        <register name="EDX"/>
        <register name="ST0"/>
        <register name="ST1"/>
      </killedbycall>
      <likelytrash>
        <register name="EAX"/>
      </likelytrash>
    </prototype>
  </default_proto>
  <prototype name="__regparm4" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
      <!-- Watcom passes the third argument in EBX and the fourth in ECX -->
      <pentry minsize="1" maxsize="4">
        <register name="EBX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="ECX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
    </unaffected>
    <killedbycall>
      <register name="EAX"/>
      <register name="ECX"/>
      <register name="EDX"/>
      <register name="EBX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
  </prototype>
  <prototype name="__regparm3" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
      <!-- Watcom passes the third argument in EBX, so ECX is left untouched here -->
      <pentry minsize="1" maxsize="4">
        <register name="EBX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="ECX"/>
    </unaffected>
    <killedbycall>
      <register name="EAX"/>
      <register name="EBX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
  </prototype>
  <prototype name="__regparm2" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EDX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
      <register name="ECX"/>
    </unaffected>
    <killedbycall>
      <register name="EAX"/>
      <register name="EDX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
  </prototype>
  <prototype name="__regparm1" extrapop="4" stackshift="4">
    <input>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="1" maxsize="500" align="4">
        <addr offset="4" space="stack"/>
      </pentry>
    </input>
    <output killedbycall="true">
      <pentry minsize="4" maxsize="10" metatype="float" extension="float">
        <register name="ST0"/>
      </pentry>
      <pentry minsize="1" maxsize="4">
        <register name="EAX"/>
      </pentry>
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="EDX" piece2="EAX"/>
      </pentry>
    </output>
    <unaffected>
      <register name="ESP"/>
      <register name="EBP"/>
      <register name="ESI"/>
      <register name="EDI"/>
      <register name="EBX"/>
      <register name="ECX"/>
      <register name="EDX"/>
    </unaffected>
    <killedbycall>
      <register name="EAX"/>
      <register name="ST0"/>
      <register name="ST1"/>
    </killedbycall>
  </prototype>
  <resolveprototype name="__cdecl/__regparm">
    <model name="__cdecl"/>        <!-- The default case -->
    <model name="__regparm4"/>
    <model name="__regparm3"/>
    <model name="__regparm2"/>
    <model name="__regparm1"/>
  </resolveprototype>
  <eval_current_prototype name="__cdecl/__regparm"/>

</compiler_spec>

You also need to add the following line to the x86.ldefs file, inside the <language> element you want the compiler offered for.

    <compiler name="Watcom" spec="x86watcom.cspec" id="watcom"/>

You may be able to modify this for the 16-bit case.
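
For orientation, a <compiler> element sits inside one of the <language> elements of x86.ldefs. The fragment below is a rough sketch of how the 32-bit entry might look after the change. Everything except the Watcom line is written from memory of a stock install and should be treated as an assumption, so compare it against your own file rather than copying it.

```xml
<!-- Sketch only: every attribute value except the Watcom line is an assumption;
     the version number in particular differs between Ghidra releases -->
<language processor="x86"
          endian="little"
          size="32"
          variant="default"
          version="2.9"
          slafile="x86.sla"
          processorspec="x86.pspec"
          id="x86:LE:32:default">
  <description>Intel/AMD 32-bit x86</description>
  <compiler name="Visual Studio" spec="x86win.cspec" id="windows"/>
  <compiler name="gcc" spec="x86gcc.cspec" id="gcc"/>
  <!-- Added line: makes x86watcom.cspec selectable as the "Watcom" compiler -->
  <compiler name="Watcom" spec="x86watcom.cspec" id="watcom"/>
</language>
```

The spec attribute is a file name, so x86watcom.cspec would need to sit next to the other .cspec files that the ldefs file already references.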

abelbriggs1 commented 4 years ago

I was able to get the custom calling convention to work by changing the language of the project (right click the file you're reversing in the project overview window, select 'set language'). Then I was able to select Watcom as the compiler.

0xBEEEF commented 3 years ago

@stevecheckoway Thanks for providing the file. It works perfectly! Could you maybe make an official Pull Request out of it? It would be an enormous gain if the compiler was still supported. Maybe this will help to add it even faster!

russdill commented 2 years ago

@stevecheckoway Thanks, your efforts have made a very difficult analysis much easier. I think the current state of Ghidra is limiting things a bit, though.

I currently have to select regparm1 vs. regparm2, 3, 4, etc. manually. I realize there can be situations where a function takes 4 arguments but receives the 3rd and 4th on the stack rather than in registers, but it would be helpful if the default were to base the register usage count on the argument count. It's currently very painful for functions that take one argument but default to regparm4, since all 4 registers are then assumed to be lost rather than saved.

I can't assign any of the Watcom prototypes to a new function definition, to function pointers, or to overrides.

I didn't analyze too closely, but it appears there may be a slightly modified calling convention where EAX contains this and is preserved.

Analysis is often overeager with 8-byte returns. Once a single function that takes one argument is misidentified as returning EDX:EAX and taking two arguments in EAX and EDX (rather than taking a single argument and returning EAX), the entire call tree gets polluted, since EDX is seen as actively being passed around.

It'd also be useful to include __stdcall, as Watcom-compiled executables often interface with Windows APIs.
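
As a rough illustration of the __stdcall request, the block below sketches what such a prototype might look like if added to the cspec posted earlier in this thread. It is modeled on that file's __cdecl block, with extrapop set to "unknown" on the assumption that the callee pops its own arguments; it is untested against Watcom binaries.

```xml
<!-- Sketch only: copied from the __cdecl block above, with extrapop changed -->
<prototype name="__stdcall" extrapop="unknown" stackshift="4">
  <input>
    <!-- All arguments are passed on the stack, starting above the return address -->
    <pentry minsize="1" maxsize="500" align="4">
      <addr offset="4" space="stack"/>
    </pentry>
  </input>
  <output killedbycall="true">
    <pentry minsize="4" maxsize="10" metatype="float" extension="float">
      <register name="ST0"/>
    </pentry>
    <pentry minsize="1" maxsize="4">
      <register name="EAX"/>
    </pentry>
    <pentry minsize="5" maxsize="8">
      <addr space="join" piece1="EDX" piece2="EAX"/>
    </pentry>
  </output>
  <unaffected>
    <register name="ESP"/>
    <register name="EBP"/>
    <register name="ESI"/>
    <register name="EDI"/>
    <register name="EBX"/>
  </unaffected>
  <killedbycall>
    <register name="ECX"/>
    <register name="EDX"/>
    <register name="ST0"/>
    <register name="ST1"/>
  </killedbycall>
  <likelytrash>
    <register name="EAX"/>
  </likelytrash>
</prototype>
```

Declaring the prototype would only make it available for selection on a function. It would not by itself change which convention Ghidra picks by default.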

tomsons26 commented 2 years ago

ghidrawatcall.zip

I wrote a cspec for Watcom's watcall convention for my own needs. The only issue I've encountered is that Ghidra assumes all functions are watcall, so the calling conventions of cdecl and stdcall functions need correcting.

canassa commented 1 year ago

I tried using the included spec by @stevecheckoway. It worked for 99% of my functions, but all calls to the Windows API are now broken, so I had to revert to using custom storage.

tomsons26 commented 1 year ago

Yeah, my take on it has the same issue: Ghidra then assumes everything is watcall.

user7 commented 1 year ago

Are you sure your definition of watcall is correct? In my experience it is as follows:

0-args: EAX is either clobbered (for a void function) or holds the return value. This applies to all the other cases as well.
1-args: EAX = arg1.
2-args: EAX, EDX = arg1, arg2; EDX is clobbered.
3-args: EAX, EDX, EBX = arg1, arg2, arg3; EDX and EBX are clobbered.
4-args: EAX, EDX, EBX, ECX = arg1, arg2, arg3, arg4; EDX, EBX, and ECX are clobbered.
n-args: same as 4-args, but arguments beyond the fourth are pushed on the stack, and the function cleans up the stack with RETN X, where X = (n_args - 4) * 4.

Your cspec marks EBX as unaffected and ECX as killed by call, but that's not true in general: EBX is affected if 3 or more args are passed, and ECX isn't killed if 3 or fewer are passed. I'm not sure what likelytrash(ECX) does, but it also seems incorrect.
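
To make the description above concrete, here is a rough sketch of how the 4-or-more-argument case might be written as a cspec prototype. The name __watcall is hypothetical, the output section is copied from the cspec posted earlier in this thread, and extrapop="unknown" is an assumption made because the number of bytes popped by RETN X depends on the argument count.

```xml
<!-- Sketch only: "__watcall" is an illustrative name, not taken from any posted file -->
<prototype name="__watcall" extrapop="unknown" stackshift="4">
  <input>
    <!-- Register order as described above: EAX, EDX, EBX, ECX -->
    <pentry minsize="1" maxsize="4">
      <register name="EAX"/>
    </pentry>
    <pentry minsize="1" maxsize="4">
      <register name="EDX"/>
    </pentry>
    <pentry minsize="1" maxsize="4">
      <register name="EBX"/>
    </pentry>
    <pentry minsize="1" maxsize="4">
      <register name="ECX"/>
    </pentry>
    <!-- Arguments beyond the fourth go on the stack and the callee pops them,
         e.g. 6 arguments gives RETN (6 - 4) * 4 = RETN 8 -->
    <pentry minsize="1" maxsize="500" align="4">
      <addr offset="4" space="stack"/>
    </pentry>
  </input>
  <output killedbycall="true">
    <pentry minsize="4" maxsize="10" metatype="float" extension="float">
      <register name="ST0"/>
    </pentry>
    <pentry minsize="1" maxsize="4">
      <register name="EAX"/>
    </pentry>
    <pentry minsize="5" maxsize="8">
      <addr space="join" piece1="EDX" piece2="EAX"/>
    </pentry>
  </output>
  <unaffected>
    <register name="ESP"/>
    <register name="EBP"/>
    <register name="ESI"/>
    <register name="EDI"/>
  </unaffected>
  <killedbycall>
    <!-- With 4 or more arguments, all four argument registers are clobbered -->
    <register name="EAX"/>
    <register name="EDX"/>
    <register name="EBX"/>
    <register name="ECX"/>
    <register name="ST0"/>
    <register name="ST1"/>
  </killedbycall>
</prototype>
```

A single prototype like this cannot express that ECX survives a 3-argument call or that EBX survives a 2-argument call, which is presumably why the earlier file splits the convention into one prototype per argument count.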

tomsons26 commented 1 year ago

It definitely isn't correct, as I sometimes have to fix things up using custom storage, but it's good enough for my needs.