Open whoismissing opened 8 months ago
The fork is currently available here under the sailr-decompiler branch.
Listed here are the function XML files used in the decompiler experiments. Each can be loaded at the decomp_dbg command line with the restore command: [decomp]> restore filename.xml
foo.xml
<xml_savefile name="foo" target="default" adjustvma="0">
<binaryimage arch="x86:LE:64:default:gcc">
<bytechunk space="ram" offset="0x101190" readonly="true">
f30f1efa85ff7508b8ffffffffc3
</bytechunk>
<bytechunk space="ram" offset="0x1011a0" readonly="true">
53488d3d5c0e000089f3e8a1feffffb8
ffffffff85db7411488d3d510e0000e8
8cfeffffb8010000005bc3
</bytechunk>
</binaryimage>
<coretypes><type name="undefined" size="1" metatype="unknown" id="0xc000000000000000"/><type name="uint7" size="7" metatype="uint" id="0xc0000000000000c0"/><type name="uint" size="4" metatype="uint" id="0xc0000000000000c1"/><type name="code" size="1" metatype="code" id="0xe000000000000001"/><type name="double" size="8" metatype="float" id="0xc000000000000082"/><type name="ulong" size="8" metatype="uint" id="0xc0000000000000c3"/><type name="ushort" size="2" metatype="uint" id="0xc0000000000000c5"/><type name="float16" size="16" metatype="float" id="0xc000000000000085"/><type name="void" size="0" metatype="void" id="0xc0000000000000c6"/><type name="float2" size="2" metatype="float" id="0xc000000000000086"/><type name="wchar16" size="2" metatype="int" utf="true" id="0xc0000000000000c8"/><type name="float" size="4" metatype="float" id="0xc00000000000008a"/><type name="wchar_t" size="4" metatype="int" utf="true" id="0xc0000000000000ca"/><type name="int16" size="16" metatype="int" id="0xc000000000000090"/><type name="int3" size="3" metatype="int" id="0xc000000000000091"/><type name="int5" size="5" metatype="int" id="0xc000000000000092"/><type name="int6" size="6" metatype="int" id="0xc000000000000093"/><type name="int7" size="7" metatype="int" id="0xc000000000000094"/><type name="int" size="4" metatype="int" id="0xc000000000000095"/><type name="long" size="8" metatype="int" id="0xc000000000000097"/><type name="longdouble" size="10" metatype="float" id="0xc000000000000099"/><type name="short" size="2" metatype="int" id="0xc0000000000000a5"/><type name="sbyte" size="1" metatype="int" id="0xc0000000000000a6"/><type name="undefined1" size="1" metatype="unknown" id="0xc0000000000000b1"/><type name="undefined2" size="2" metatype="unknown" id="0xc0000000000000b2"/><type name="undefined3" size="3" metatype="unknown" id="0xc0000000000000b3"/><type name="undefined4" size="4" metatype="unknown" id="0xc0000000000000b4"/><type name="undefined5" size="5" metatype="unknown" 
id="0xc0000000000000b5"/><type name="undefined6" size="6" metatype="unknown" id="0xc0000000000000b6"/><type name="undefined7" size="7" metatype="unknown" id="0xc0000000000000b7"/><type name="undefined8" size="8" metatype="unknown" id="0xc0000000000000b8"/><type name="bool" size="1" metatype="bool" id="0xc000000000000079"/><type name="byte" size="1" metatype="uint" id="0xc00000000000007a"/><type name="char" size="1" metatype="int" char="true" id="0xc00000000000007b"/><type name="uint16" size="16" metatype="uint" id="0xc0000000000000bc"/><type name="uint3" size="3" metatype="uint" id="0xc0000000000000bd"/><type name="uint5" size="5" metatype="uint" id="0xc0000000000000be"/><type name="uint6" size="6" metatype="uint" id="0xc0000000000000bf"/></coretypes><save_state>
<typegrp intsize="4" longsize="8" structalign="4" enumsize="4" enumsigned="false"><type name="" metatype="ptr" size="8"><type name="char" id="0xc00000000000007b" metatype="int" size="1" char="true"/></type></typegrp><db scopeidbyname="false">
<scope name="" id="0x0">
<symbollist>
<mapsym><function id="0x39" name="foo" size="1"><addr space="ram" offset="0x101190"/><localdb lock="false" main="stack"><scope name="foo"><parent id="0x0"/><rangelist/><symbollist><mapsym><symbol id="0x58" name="param_1" typelock="true" namelock="true" readonly="false" merge="false" cat="0" index="0x0"><type name="int" id="0xc000000000000095" metatype="int" size="4"/></symbol><addr space="register" offset="0x38" size="4"/><rangelist><range space="ram" first="0x10118f" last="0x10118f"/></rangelist></mapsym><mapsym><symbol id="0x59" name="param_2" typelock="true" namelock="true" readonly="false" merge="false" cat="0" index="0x1"><type name="int" id="0xc000000000000095" metatype="int" size="4"/></symbol><addr space="register" offset="0x30" size="4"/><rangelist><range space="ram" first="0x10118f" last="0x10118f"/></rangelist></mapsym></symbollist></scope></localdb><prototype extrapop="8" model="__stdcall" modellock="true"><returnsym typelock="true"><addr space="register" offset="0x0" size="4"/><type name="int" id="0xc000000000000095" metatype="int" size="4"/></returnsym></prototype></function><addr space="ram" offset="0x101190" size="1"/><rangelist/></mapsym><mapsym><function id="0x49" name="puts" size="1"><addr space="ram" offset="0x101050"/><localdb lock="false" main="stack"><scope name="puts"><parent id="0x0"/><rangelist/><symbollist><mapsym><symbol id="0x56" name="__s" typelock="true" namelock="true" readonly="false" merge="false" cat="0" index="0x0"><type name="" metatype="ptr" size="8"><type name="char" id="0xc00000000000007b" metatype="int" size="1" char="true"/></type></symbol><addr space="register" offset="0x38" size="8"/><rangelist><range space="ram" first="0x10104f" last="0x10104f"/></rangelist></mapsym></symbollist></scope></localdb><prototype extrapop="8" model="unknown"><returnsym typelock="true"><addr space="register" offset="0x0" size="4"/><type name="int" id="0xc000000000000095" metatype="int" size="4"/></returnsym></prototype></function><addr 
space="ram" offset="0x101050" size="1"/><rangelist/></mapsym><mapsym><symbol name="s_first_print_00102004" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><type name="" metatype="array" size="12" arraysize="12"><typeref name="char" id="0xc00000000000007b"/></type></symbol><addr space="ram" offset="0x102004" size="12"/><rangelist/></mapsym><mapsym><symbol name="s_leaving_foo..._00102010" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><type name="" metatype="array" size="15" arraysize="15"><typeref name="char" id="0xc00000000000007b"/></type></symbol><addr space="ram" offset="0x102010" size="15"/><rangelist/></mapsym></symbollist>
</scope>
</db>
<context_points>
<context_pointset space="ram" offset="0x101190"><set name="longMode" val="1"/><set name="reserved" val="0"/><set name="addrsize" val="2"/><set name="bit64" val="1"/><set name="opsize" val="1"/><set name="segover" val="0"/><set name="highseg" val="0"/><set name="protectedMode" val="0"/><set name="repneprefx" val="0"/><set name="xacquireprefx" val="0"/><set name="prefix_f2" val="0"/><set name="mandover" val="0"/><set name="repprefx" val="0"/><set name="xreleaseprefx" val="0"/><set name="prefix_f3" val="0"/><set name="prefix_66" val="0"/><set name="rexWprefix" val="0"/><set name="rexWRXBprefix" val="0"/><set name="rexRprefix" val="0"/><set name="rexXprefix" val="0"/><set name="rexBprefix" val="0"/><set name="rexprefix" val="0"/><set name="vexMode" val="0"/><set name="vexL" val="0"/><set name="suffix3D" val="0"/><set name="vexVVVV" val="0"/><set name="vexMMMMM" val="0"/><set name="instrPhase" val="0"/><set name="lockprefx" val="0"/></context_pointset><tracked_pointset space="ram" offset="0x101190"><set space="register" offset="0x20a" size="1" val="0x0"/></tracked_pointset></context_points>
<commentdb/><stringmanage><string><addr space="ram" offset="0x102004"/><bytes trunc="false">6669727374207072696e7400
</bytes></string><string><addr space="ram" offset="0x102010"/><bytes trunc="false">6c656176696e6720666f6f2e2e2e00
</bytes></string></stringmanage><optionslist><readonly>on</readonly><setlanguage>c-language</setlanguage><protoeval>__stdcall</protoeval></optionslist></save_state>
</xml_savefile>
binop.xml
<xml_savefile name="binop" target="default" adjustvma="0">
<binaryimage arch="x86:LE:64:default:gcc">
<bytechunk space="ram" offset="0x100040" readonly="true">
4154488d35d719000041bc0100000055
4889fd4883ec08e8a42f000085c07413
488d35bb1900004889efe8912f000085
c0750d4883c4084489e05d415cc3
</bytechunk>
<bytechunk space="ram" offset="0x100080" readonly="true">
488d359e1900004889efe8712f000085
c074e0488d358e1900004889efe85e2f
000085c074cd488d357f1900004889ef
e84b2f000085c074ba488d3570190000
4889efe8382f000085c074a7488d3561
1900004889efe8252f000085c0749448
8d35521900004889efe8122f000085c0
7481488d35431900004889efe8ff2e00
0085c00f846affffff488d3530190000
4889efe8e82e000085c00f8453ffffff
488d351d1900004889efe8d12e000085
c00f843cffffff488d350a1900004889
efe8ba2e000085c0410f94c4e922ffff
ff
</bytechunk>
</binaryimage>
<coretypes><type name="undefined" size="1" metatype="unknown" id="0xc000000000000000"/><type name="uint7" size="7" metatype="uint" id="0xc0000000000000c0"/><type name="uint" size="4" metatype="uint" id="0xc0000000000000c1"/><type name="code" size="1" metatype="code" id="0xe000000000000001"/><type name="double" size="8" metatype="float" id="0xc000000000000082"/><type name="ulong" size="8" metatype="uint" id="0xc0000000000000c3"/><type name="ushort" size="2" metatype="uint" id="0xc0000000000000c5"/><type name="float16" size="16" metatype="float" id="0xc000000000000085"/><type name="void" size="0" metatype="void" id="0xc0000000000000c6"/><type name="float2" size="2" metatype="float" id="0xc000000000000086"/><type name="wchar16" size="2" metatype="int" utf="true" id="0xc0000000000000c8"/><type name="float" size="4" metatype="float" id="0xc00000000000008a"/><type name="wchar_t" size="4" metatype="int" utf="true" id="0xc0000000000000ca"/><type name="int16" size="16" metatype="int" id="0xc000000000000090"/><type name="int3" size="3" metatype="int" id="0xc000000000000091"/><type name="int5" size="5" metatype="int" id="0xc000000000000092"/><type name="int6" size="6" metatype="int" id="0xc000000000000093"/><type name="int7" size="7" metatype="int" id="0xc000000000000094"/><type name="int" size="4" metatype="int" id="0xc000000000000095"/><type name="long" size="8" metatype="int" id="0xc000000000000097"/><type name="longdouble" size="10" metatype="float" id="0xc000000000000099"/><type name="short" size="2" metatype="int" id="0xc0000000000000a5"/><type name="sbyte" size="1" metatype="int" id="0xc0000000000000a6"/><type name="undefined1" size="1" metatype="unknown" id="0xc0000000000000b1"/><type name="undefined2" size="2" metatype="unknown" id="0xc0000000000000b2"/><type name="undefined3" size="3" metatype="unknown" id="0xc0000000000000b3"/><type name="undefined4" size="4" metatype="unknown" id="0xc0000000000000b4"/><type name="undefined5" size="5" metatype="unknown" 
id="0xc0000000000000b5"/><type name="undefined6" size="6" metatype="unknown" id="0xc0000000000000b6"/><type name="undefined7" size="7" metatype="unknown" id="0xc0000000000000b7"/><type name="undefined8" size="8" metatype="unknown" id="0xc0000000000000b8"/><type name="bool" size="1" metatype="bool" id="0xc000000000000079"/><type name="byte" size="1" metatype="uint" id="0xc00000000000007a"/><type name="char" size="1" metatype="int" char="true" id="0xc00000000000007b"/><type name="uint16" size="16" metatype="uint" id="0xc0000000000000bc"/><type name="uint3" size="3" metatype="uint" id="0xc0000000000000bd"/><type name="uint5" size="5" metatype="uint" id="0xc0000000000000be"/><type name="uint6" size="6" metatype="uint" id="0xc0000000000000bf"/></coretypes><save_state>
<typegrp intsize="4" longsize="8" structalign="4" enumsize="4" enumsigned="false"><type name="undefined" id="0xc000000000000000" metatype="unknown" size="1"/><type name="" metatype="ptr" size="8"><type name="char" id="0xc00000000000007b" metatype="int" size="1" char="true"/></type></typegrp><db scopeidbyname="false">
<scope name="" id="0x0">
<symbollist>
<mapsym><function id="0x8" name="binop" size="1"><addr space="ram" offset="0x100040"/><localdb lock="false" main="stack"><scope name="binop"><parent id="0x0"/><rangelist/><symbollist/></scope></localdb><prototype extrapop="8" model="unknown"><returnsym><addr space="register" offset="0x0" size="1"/><typeref name="undefined" id="0xc000000000000000"/></returnsym></prototype></function><addr space="ram" offset="0x100040" size="1"/><rangelist/></mapsym><mapsym><function id="0xbd" name="strcmp" size="1"><addr space="ram" offset="0x103000"/><localdb lock="false" main="stack"><scope name="strcmp"><parent id="0x0"/><rangelist/><symbollist><mapsym><symbol id="0x12c" name="__s1" typelock="true" namelock="true" readonly="false" merge="false" cat="0" index="0x0"><type name="" metatype="ptr" size="8"><type name="char" id="0xc00000000000007b" metatype="int" size="1" char="true"/></type></symbol><addr space="register" offset="0x38" size="8"/><rangelist><range space="ram" first="0x102fff" last="0x102fff"/></rangelist></mapsym><mapsym><symbol id="0x12d" name="__s2" typelock="true" namelock="true" readonly="false" merge="false" cat="0" index="0x1"><type name="" metatype="ptr" size="8"><type name="char" id="0xc00000000000007b" metatype="int" size="1" char="true"/></type></symbol><addr space="register" offset="0x30" size="8"/><rangelist><range space="ram" first="0x102fff" last="0x102fff"/></rangelist></mapsym></symbollist></scope></localdb><prototype extrapop="8" model="unknown"><returnsym typelock="true"><addr space="register" offset="0x0" size="4"/><type name="int" id="0xc000000000000095" metatype="int" size="4"/></returnsym></prototype></function><addr space="ram" offset="0x103000" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x30" name=".LC0" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a20" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x31" name=".LC1" 
typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a22" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x32" name=".LC2" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a25" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x33" name=".LC3" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a28" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x34" name=".LC4" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a2c" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x35" name=".LC5" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a30" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x36" name=".LC6" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a34" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x37" name=".LC7" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a38" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x38" name=".LC8" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a3c" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x39" name=".LC9" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" 
id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a40" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x3a" name=".LC10" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a44" size="1"/><rangelist/></mapsym><mapsym><symbol id="0x3b" name=".LC11" typelock="true" namelock="true" readonly="true" merge="false" cat="-1"><typeref name="undefined" id="0xc000000000000000"/></symbol><addr space="ram" offset="0x101a48" size="1"/><rangelist/></mapsym></symbollist>
</scope>
</db>
<context_points>
<context_pointset space="ram" offset="0x0"><set name="longMode" val="1"/><set name="reserved" val="0"/><set name="addrsize" val="2"/><set name="bit64" val="1"/><set name="opsize" val="1"/><set name="segover" val="0"/><set name="highseg" val="0"/><set name="protectedMode" val="0"/><set name="repneprefx" val="0"/><set name="xacquireprefx" val="0"/><set name="prefix_f2" val="0"/><set name="mandover" val="0"/><set name="repprefx" val="0"/><set name="xreleaseprefx" val="0"/><set name="prefix_f3" val="0"/><set name="prefix_66" val="0"/><set name="rexWprefix" val="0"/><set name="rexWRXBprefix" val="0"/><set name="rexRprefix" val="0"/><set name="rexXprefix" val="0"/><set name="rexBprefix" val="0"/><set name="rexprefix" val="0"/><set name="vexMode" val="0"/><set name="vexL" val="0"/><set name="suffix3D" val="0"/><set name="vexVVVV" val="0"/><set name="vexMMMMM" val="0"/><set name="instrPhase" val="0"/><set name="lockprefx" val="0"/></context_pointset><tracked_pointset space="ram" offset="0x100040"><set space="register" offset="0x20a" size="1" val="0x0"/></tracked_pointset></context_points>
<commentdb/><stringmanage><string><addr space="ram" offset="0x101a20"/><bytes trunc="false">3d00
</bytes></string><string><addr space="ram" offset="0x101a22"/><bytes trunc="false">213d00
</bytes></string><string><addr space="ram" offset="0x101a25"/><bytes trunc="false">3d3d00
</bytes></string><string><addr space="ram" offset="0x101a28"/><bytes trunc="false">2d6e7400
</bytes></string><string><addr space="ram" offset="0x101a2c"/><bytes trunc="false">2d6f7400
</bytes></string><string><addr space="ram" offset="0x101a30"/><bytes trunc="false">2d656600
</bytes></string><string><addr space="ram" offset="0x101a34"/><bytes trunc="false">2d657100
</bytes></string><string><addr space="ram" offset="0x101a38"/><bytes trunc="false">2d6e6500
</bytes></string><string><addr space="ram" offset="0x101a3c"/><bytes trunc="false">2d6c7400
</bytes></string><string><addr space="ram" offset="0x101a40"/><bytes trunc="false">2d6c6500
</bytes></string><string><addr space="ram" offset="0x101a44"/><bytes trunc="false">2d677400
</bytes></string><string><addr space="ram" offset="0x101a48"/><bytes trunc="false">2d676500
</bytes></string></stringmanage><optionslist><readonly>on</readonly><setlanguage>c-language</setlanguage><protoeval>__stdcall</protoeval></optionslist></save_state>
</xml_savefile>
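For readers who want to inspect the savefiles above before loading them, here is a minimal Python sketch that lists the bytechunk regions of a savefile. It embeds a trimmed excerpt of foo.xml (only the binaryimage element), and the helper name read_bytechunks is hypothetical, not part of Ghidra or decomp_dbg.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of foo.xml: only the <binaryimage> element is kept.
SAVEFILE = """\
<xml_savefile name="foo" target="default" adjustvma="0">
<binaryimage arch="x86:LE:64:default:gcc">
<bytechunk space="ram" offset="0x101190" readonly="true">
f30f1efa85ff7508b8ffffffffc3
</bytechunk>
<bytechunk space="ram" offset="0x1011a0" readonly="true">
53488d3d5c0e000089f3e8a1feffffb8
ffffffff85db7411488d3d510e0000e8
8cfeffffb8010000005bc3
</bytechunk>
</binaryimage>
</xml_savefile>
"""

def read_bytechunks(xml_text):
    """Return (start offset, raw bytes) for every <bytechunk> in a savefile."""
    root = ET.fromstring(xml_text)
    chunks = []
    for chunk in root.iter("bytechunk"):
        offset = int(chunk.get("offset"), 16)
        # The hex dump is wrapped across lines, so drop all whitespace first.
        data = bytes.fromhex("".join(chunk.text.split()))
        chunks.append((offset, data))
    return chunks

for offset, data in read_bytechunks(SAVEFILE):
    print(f"{offset:#x}: {len(data)} bytes")
```

For a full savefile on disk, the same loop should work on the root returned by ET.parse(path).getroot().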
Is your feature request related to a problem? Please describe.
As described in the SAILR decompiler research, compiler-aware structuring algorithms may be a strategy for making decompilation more closely resemble the source code.
These optimizations include:
A motivating example of ISC as a result of cross jumping is the following source code:
Currently, when decompiled by Ghidra, this example results in the following pseudocode:
A real-world example of this is demonstrated by SAILR-EVAL, which contains some examples from Coreutils 9.1. Below is Ghidra's current decompilation of the binop() function in test.o:
Describe the solution you'd like
Although ActionReturnSplit exists, it is currently unable to detect the above test cases with ActionReturnSplit::gatherReturnGotos() because the BlockGraphs do not emit gotos.
As a preliminary proof of concept, I've experimented with a new BlockAction that is more lenient in detecting the cases above, and I present an example showing that Funcdata.nodeSplit() may be used to perform this deoptimization.
With this new BlockAction, the above decompilation produces separate returns that were previously merged.
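The effect of splitting a merged return can be illustrated on a toy control-flow graph. The following is only a sketch in Python, not the decompiler's C++ and not the actual nodeSplit() implementation: the graph is a plain dictionary of successor lists, a block with no successors stands in for a return block, and the function name split_shared_returns is hypothetical.

```python
from collections import defaultdict

def split_shared_returns(succs):
    """Give each predecessor of a shared return block its own copy.

    succs maps a block name to its list of successor names; a block with
    no successors is treated as a return block.
    """
    # Build de-duplicated predecessor lists from the successor map.
    preds = defaultdict(list)
    for block, outs in succs.items():
        for out in outs:
            if block not in preds[out]:
                preds[out].append(block)

    result = {block: list(outs) for block, outs in succs.items()}
    for block, outs in succs.items():
        if outs or len(preds[block]) < 2:
            continue  # not a return block, or not shared
        # The first predecessor keeps the original; the rest get clones.
        for i, pred in enumerate(preds[block][1:], start=1):
            clone = f"{block}_dup{i}"
            result[clone] = []
            result[pred] = [clone if s == block else s for s in result[pred]]
    return result

# Two paths that were merged into a single return block by cross jumping.
cfg = {"entry": ["early", "body"], "early": ["ret"], "body": ["ret"], "ret": []}
print(split_shared_returns(cfg))
```

In this toy model, "body" ends up pointing at its own copy of the return block while "early" keeps the original, which mirrors the separate returns described above.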
Describe alternatives you've considered
As an alternative to introducing new BlockActions, it could be better to extend the existing ActionReturnSplit and ActionNodeJoin BlockActions.
Additional context
I attempted an experiment to implement ISD deoptimization by introducing a new BlockAction called ActionRevertISD and DuplicateJoin; however, I left the code pattern for matching storage-equivalent statements unimplemented because I found that detecting duplicate statements requires semantic comparison. I attempted to use CSE hashes for this purpose, but I found that they do not apply to load or call statements.
For this and other use cases, it would be helpful to have some form of semantic comparison across PCode operations.
On a related note, this could help address https://github.com/NationalSecurityAgency/ghidra/issues/6014
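As a rough illustration of the kind of comparison requested above, here is a Python sketch that builds a structural key for toy expression trees, including LOAD and CALL nodes. The tuple representation and the name structural_key are assumptions made for this example and do not correspond to any existing decompiler API. Equal keys only show that two statements have the same shape; deciding that a LOAD or CALL really yields the same value would also require checking for intervening stores and side effects, which this sketch does not model.

```python
# P-code opcodes whose operand order does not affect the result.
COMMUTATIVE = {
    "INT_ADD", "INT_MULT", "INT_AND", "INT_OR", "INT_XOR",
    "INT_EQUAL", "INT_NOTEQUAL",
}

def structural_key(node):
    """Return a hashable key describing the shape of an expression tree.

    A node is either a leaf string (a constant or storage location) or a
    tuple of the form (opcode, operand, ...).
    """
    if isinstance(node, str):
        return node
    opcode, *operands = node
    keys = [structural_key(op) for op in operands]
    if opcode in COMMUTATIVE:
        # Normalize operand order so a+b and b+a produce the same key.
        keys.sort(key=repr)
    return (opcode, *keys)

# Two calls whose argument differs only in the operand order of an add.
first = ("CALL", "puts", ("INT_ADD", "RDI", "0x8"))
second = ("CALL", "puts", ("INT_ADD", "0x8", "RDI"))
print(structural_key(first) == structural_key(second))
```

Keys built this way could serve as dictionary keys to group candidate duplicate statements before a more careful check.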