fkie-cad / dewolf

A research decompiler implemented as a Binary Ninja plugin.
GNU Lesser General Public License v2.1
171 stars 9 forks source link

[Pattern independent restructuring] 'NoneType' object is not iterable #213

Open NeoQuix opened 1 year ago

NeoQuix commented 1 year ago

What happened?

Error in bin/refsutil.exe in 0x1400ba588
[pipeline.py:107 run()] ERROR - Failed to decompile ?MapView@@YAEPEAU_MscFileObject@@PEAT_LARGE_INTEGER@@KKPEAEPEAPEAX3@Z, error during stage pattern-independent-restructuring: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/home/neoquix/Git-Repos/DeWolf/decompile.py", line 76, in <module>
    main(Decompiler)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/util/commandline.py", line 80, in main
    task = decompiler.decompile(function_name, options)
  File "/home/neoquix/Git-Repos/DeWolf/decompile.py", line 51, in decompile
    pipeline.run(task)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/pipeline.py", line 109, in run
    raise e
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/pipeline.py", line 102, in run
    instance.run(task)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring.py", line 45, in run
    self.restructure_cfg()
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring.py", line 81, in restructure_cfg
    AcyclicRegionRestructurer(self.t_cfg, self.asforest).restructure()
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/acyclic_restructuring.py", line 44, in restructure
    self._construct_ast_for_region(restructurable_region, node)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/acyclic_restructuring.py", line 69, in _construct_ast_for_region
    restructured_region_root = self._construct_refined_ast(seq_node)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/acyclic_restructuring.py", line 92, in _construct_refined_ast
    ConditionBasedRefinement.refine(self.asforest)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/condition_based_refinement.py", line 35, in refine
    if_refinement._condition_based_refinement()
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/condition_based_refinement.py", line 51, in _condition_based_refinement
    newly_added_sequence_nodes = self._structure_sequence_node(seq_node)
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/pipeline/controlflowanalysis/restructuring_commons/condition_based_refinement.py", line 95, in _structure_sequence_node
    for child in list(sequence_node.children):
  File "/home/neoquix/Git-Repos/DeWolf/decompiler/structures/ast/ast_nodes.py", line 274, in children
    if set(self._sorted_children) != set(children):
TypeError: 'NoneType' object is not iterable

How to reproduce?

Decompile refsutil at 0x1400ba588.

Affected Binary Ninja Version(s)

3.3.3996

NeoQuix commented 1 year ago

Again as in #189, SeqNode has None type as children field because of NetworkXUnfeasible in reachability_graph.py L.93, caused by sequence_node._sorted_children = sibling_reachability.sorted_nodes() in condition_based_refinement.py L.79.

Wasn't the fix, that the networkx error should not result in the first place? Maybe instead of returning None, simply crash with an error.

ebehner commented 1 year ago

I found the transformation after which we can not sort the children anymore. However, the problem behind this seems to be more complicated.

We do an "invalid" transformation in https://github.com/fkie-cad/dewolf/blob/bdb0eacbfc335b0a6f69e419e80b3a84801ff411/decompiler/pipeline/controlflowanalysis/restructuring_commons/ast_processor.py#L146. More precisely, in this function we search for a condition node cn with two branches tb and fb, where exactly one ends with a return statement. For simplicity, let the true branch tb end with a return. Then we extract the false-branch fb from the condition node and make sure that the condition node, now only having the true branch, is executed before the false-branch child. We do this by updating the reachability of the code-nodes. In general, this is no problem, because nodes with opposite conditions can not reach each other. However, in this sample, all code nodes that are contained in the false-branch reach the once in the true-branch, thus after the extraction, we add that all code nodes contained in the true-branch reach the once in the false-branch.

One solution would be to remove the reachability between the code-nodes of the true and false branch of a condition node. These should never be able to reach each other.

However, something like this should not happen, and it happens in this sample only because the SSA-form seems to be incorrect. The condition *var_0 == 0 occurs three times, where var_0 has the same SSA-value rbx#4 in all cases, but the value *var_0 changes once. We do not capture this with the SSA-form. Thus, to give all three conditions the same symbol is false.

For reference, the initial cfg after the lifting start_cfg

and the cfg before the restructuring cfg

How do we want to solve this?

ebehner commented 1 year ago
mm4rks commented 12 months ago

duplicate with minimal sample https://github.com/fkie-cad/dewolf/issues/337