fkie-cad / dewolf

A research decompiler implemented as a Binary Ninja plugin.
GNU Lesser General Public License v2.1

ValueError: At least two variables in an identity group have out degree zero #135

Open jnhols opened 1 year ago

jnhols commented 1 year ago

What happened?

The decompiler crashes with a ValueError in identity_elimination.

Traceback (most recent call last):
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompile.py", line 80, in <module>
    main(Decompiler)
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompiler/util/commandline.py", line 65, in main
    task = decompiler.decompile(function_name, options)
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompile.py", line 55, in decompile
    pipeline.run(task)
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompiler/pipeline/pipeline.py", line 97, in run
    instance.run(task)
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompiler/pipeline/dataflowanalysis/identity_elimination.py", line 312, in run
    variable_replacer.replace_variables(identity_group, identity_graph.find_replacement_variable_of_group(identity_group))
  File "/home/ubuntu/.binaryninja/plugins/dewolf/decompiler/pipeline/dataflowanalysis/identity_elimination.py", line 225, in find_replacement_variable_of_group
    raise ValueError(message)
ValueError: At least two variables in the identity group {rcx_7#44 (type: char * aliased: False), rcx_7#30 (type: char * aliased: False), var_128_1#7 (type: char * aliased: False), rcx_7#42 (type: char * aliased: False), r10_1#27 (type: char * aliased: False), r10_5#18 (type: char * aliased: False), 1 type: long, r10_1#1 (type: char * aliased: False), rcx_7#33 (type: char * aliased: False), var_128_1#18 (type: char * aliased: False), var_128_1#1 (type: char * aliased: False), r10_5#16 (type: char * aliased: False), rcx_7#41 (type: char * aliased: False), r10_1#12 (type: char * aliased: False), rcx_7#6 (type: char * aliased: False), rcx_18#37 (type: char * aliased: False), var_128_1#6 (type: char * aliased: False), r10_1#20 (type: char * aliased: False), r10_1#21 (type: char * aliased: False), var_128_1#3 (type: char * aliased: False), r10_1#26 (type: char * aliased: False), rcx_18#39 (type: char * aliased: False), r10_1#9 (type: char * aliased: False), var_128_1#16 (type: char * aliased: False), var_58#0 (type: char * aliased: False)} have out degree zero, namely 0x1 and r10_1#1, i.e., these set of vertices is not an identity group
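To make the error message easier to read, here is a minimal sketch of the kind of check it describes. This is not dewolf's implementation: the function name `find_replacement`, the plain-dict graph representation, and the edge direction (an edge points from a defined variable to the value it is defined from) are all assumptions made for illustration. The vertex `p` and its edges are hypothetical; only the two sink names are taken from the message above.

```python
from typing import Dict, Hashable, Set


def find_replacement(group: Set[Hashable], successors: Dict[Hashable, Set[Hashable]]) -> Hashable:
    """Return the single vertex of `group` with out-degree zero (hypothetical helper).

    Assumption: an edge u -> v means "u is defined as v", so a vertex without
    outgoing edges is the value every other member of the group is a copy of.
    """
    # A vertex is a sink if none of its successors lies inside the group.
    sinks = sorted((v for v in group if not (successors.get(v, set()) & group)), key=str)
    if len(sinks) != 1:
        # Zero sinks (a cycle) or several sinks: the group cannot be collapsed to one value.
        raise ValueError(f"expected exactly one vertex with out degree zero, found {sinks}")
    return sinks[0]


# A chain of copies collapses to its origin.
print(find_replacement({"x#2", "x#1", "x#0"}, {"x#2": {"x#1"}, "x#1": {"x#0"}}))  # x#0

# Hypothetical group with two sinks, named after the ones in the error message.
try:
    find_replacement({"p", "r10_1#1", "0x1"}, {"p": {"r10_1#1", "0x1"}})
except ValueError as error:
    print(error)
```

Under these assumptions, a group with two sinks has two candidate replacement values, which is the situation the ValueError reports.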

How to reproduce?

Decompile main in the following sample.

seq.zip

Affected Binary Ninja Version(s)

3.2.3814

NeoQuix commented 1 year ago

Notes on the problem itself:

==> Sometimes uninitialized variables should be in the group of out-nodes (or be completely purged from the dependency graph).

NeoQuix commented 1 year ago

Smaller code that probably reproduces the same problem:

int f1(int argc){
    char* g_1 = "dump global str1";
    char* g_2 = "dump global str2";
    char* ptr;

    if(argc){
        g_1 = ptr;
    }else{
        g_2 = ptr;
    }

    return g_1;
}

Output of Decompiler:

void * f1(int arg1) {
    return "dump global str1";
}

The stage incorrectly merges var_10#0, var_20#3 and "dump global str1" into "dump global str1", completely destroying the correct control flow. Again, var_10#0 (char* ptr) is not in the group of out-nodes; adding it to that group would probably fix the problem.
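The following toy model illustrates one possible reading of this observation. It is a sketch under assumptions, not dewolf code: the SSA names (`g_1#0`, `ptr#0`, ...), the `DEFINITIONS` table, and the helper `sinks` are hypothetical, and whether dewolf actually drops undefined variables this way is not confirmed by the thread.

```python
# Hypothetical SSA form of f1 (illustrative names, not dewolf output):
#   g_1#0 = "dump global str1"
#   g_1#1 = ptr#0                  # ptr#0 has no definition (uninitialized)
#   g_1#2 = phi(g_1#0, g_1#1)
#   return g_1#2
DEFINITIONS = {
    "g_1#0": ['"dump global str1"'],
    "g_1#1": ["ptr#0"],
    "g_1#2": ["g_1#0", "g_1#1"],
}


def is_undefined_variable(name, definitions):
    """An SSA variable (marked by '#') that is never defined in `definitions`."""
    return "#" in name and name not in definitions


def sinks(definitions, ignore_undefined):
    """Collect the operands that are not defined themselves, i.e. the out-nodes."""
    result = set()
    for operands in definitions.values():
        for operand in operands:
            if operand in definitions:
                continue  # defined elsewhere, so it has an outgoing edge
            if ignore_undefined and is_undefined_variable(operand, definitions):
                continue  # assumption: the uninitialized variable is dropped
            result.add(operand)
    return result


# Dropping ptr#0 leaves a single out-node, so everything collapses to the string.
print(sinks(DEFINITIONS, ignore_undefined=True))
# Keeping ptr#0 yields two out-nodes, so the group is not an identity group.
print(len(sinks(DEFINITIONS, ignore_undefined=False)))  # 2
```

In this model, counting the uninitialized variable as an out-node prevents the group from being merged into the string constant, which matches the fix suggested above.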

Binary for testing: a.out.zip

NeoQuix commented 1 year ago

Note: All three functions in the binary have the same problem, but main behaves differently because of its return type (possibly a separate issue).

NeoQuix commented 1 year ago

Many more binaries have the problem. I will not upload them all, only the log that shows which functions are affected.

All binaries are part of GNU coreutils. Log: ValueError: At least.log

NeoQuix commented 1 year ago

Updated log for Windows bins: ValueError: At least two varia.log

ebehner commented 1 year ago

I think I found the problem, at least for some of the samples; I did not check all of them :wink:

We build the identity graph and then remove edges such that each connected component can be an identity group afterward. What we miss is the following case: We have the following phi-functions:

cfg: ie_bug_start

identity graph, where we try to identify each connected component: ie_graph_prune_one
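The general idea of splitting an identity graph into connected components and inspecting each one can be sketched as follows. This is a simplified illustration, not the dewolf implementation: the edge-list representation and the function names are assumptions, and the edge-pruning step is left out because the thread does not describe how edges are removed.

```python
from collections import defaultdict


def identity_components(edges):
    """Return the weakly connected components of a directed edge list."""
    neighbours = defaultdict(set)
    for source, target in edges:
        # Ignore direction when grouping vertices into components.
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            vertex = stack.pop()
            if vertex in component:
                continue
            component.add(vertex)
            stack.extend(neighbours[vertex] - component)
        seen |= component
        components.append(component)
    return components


def sinks_per_component(edges):
    """Pair every component with its vertices of out-degree zero."""
    has_successor = {source for source, _ in edges}
    return [(component, component - has_successor) for component in identity_components(edges)]


# Hypothetical edges: "a#2 -> a#1" reads "a#2 is defined as a#1".
example = [("a#2", "a#1"), ("a#1", "a#0"), ("b#1", "b#0"), ("b#1", "c#0")]
for component, component_sinks in sinks_per_component(example):
    print(sorted(component), "->", sorted(component_sinks))
```

A component with exactly one sink can be replaced by that sink; a component with several sinks, like the second one in the example, would have to be split further before it can be treated as an identity group.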

ebehner commented 4 months ago

Issue #388 solves only some of the problems mentioned in this issue. It does not address https://github.com/fkie-cad/dewolf/issues/135#issuecomment-1413464989