angr / angr-management

The official angr GUI.
BSD 2-Clause "Simplified" License
886 stars 110 forks source link

Question about blank nodes under calls in proximity view #1128

Closed agnosticlines closed 10 months ago

agnosticlines commented 10 months ago

Question

Hey there,

Sorry for all the issues, I've been exploring the angr-management project!

I noticed in the proximity view for a relatively "normal" binary there's a lot of blank nodes such as these: KfUTRndXhU

If I click the function call it centres on the disassembly calling the function, but if I click the blank box it seems to centre on the instruction underneath, is this a bug or is it intended, if so what is this used to represent? If it is a bug what would you expect to be in the nodes here, or would you just expect them to be removed

Thanks!

ltfish commented 10 months ago

It is definitely a bug :) I think they should be removed. But none of us have investigated the root cause yet.

agnosticlines commented 10 months ago

It is definitely a bug :) I think they should be removed. But none of us have investigated the root cause yet.

Hey fish, so it seems like the issue is an empty BaseProxiNode is added by _process_decompilation because there are unhandled ailment statements

        # Keep all default handlers, but overwrite necessary ones:
        bw = AILBlockWalker()
        bw.stmt_handlers[ailment.Stmt.Call] = _handle_Call
        bw.expr_handlers[ailment.Stmt.Call] = _handle_CallExpr

        # Custom Graph walker, go through AIL edges
        for ail_edge in ail_graph.edges:
            new_edge = ()
            # Handle both blocks in edge
            for block in ail_edge:
                bw.walk(block)

                # Check to see if our custom _handle's already dealt with the nodes
                if self.handled_stmts:
                    # Add each handled stmt to the graph in a linear fashion []->[]->[]
                    # Linear is fine since stmts come from the same block and are handled in order of appearance
                    for idx, current in enumerate(self.handled_stmts):
                        # Skip Head of List to avoid out of bounds error
                        if idx > 0:
                            graph.add_edge(self.handled_stmts[idx - 1], current)
                    # If first block in edge, LAST handled node -> next node
                    # Else (second block in edge), prev node -> FIRST handled node
                    if block == ail_edge[0]:
                        proxi_node = self.handled_stmts[-1]
                    else:
                        proxi_node = self.handled_stmts[0]

                    # Clear the handled stmts for next block
                    self.handled_stmts = []

                # Block stmts weren't handled, make a simple empty proxi node
                else:
                    proxi_node = BaseProxiNode(ProxiNodeTypes.Empty, {block.addr})
                    print(f"Created BaseProxiNode for block at address {block.addr}: {proxi_node}"

This debug print added:

Created BaseProxiNode for block at address 5368713788: <angr.analyses.proximity_graph.BaseProxiNode object at 0x0000021FCE77CD10>

Going back through the AILBlockWalker and adding debug prints:

Processing statement of type: <class 'ailment.statement.Assignment'>
Processing statement of type: <class 'ailment.statement.Assignment'>
Processing statement of type: <class 'ailment.statement.ConditionalJump'>

The missing statement types seem to be:

Nodes that seem to be conditional jumps point to the first instruction in the basic block which is a movsxd although they do have a jump in that basic block, so I'm not sure if I'm misunderstanding or if they're Store/Assignment blocks instead, and if they are ConditionalJump blocks they clearly have other statements associated with them, so which should win, or should they all be represented?

Keep in mind this is my first time properly looking at the angr source, so this may be wrong but it's the result of an hour or two of debugging to try and fix, I'm not sure if the answer is just to implement nodes for all the statements that don't have one though as you may not want these in the ProximityView, if you have more guidance on how to handle this I'm happy to take a look at it :)

ltfish commented 10 months ago

Hi!

Actually, we can have multiple statements in each block. It is a misunderstanding that there can only be one statement per block. I think we should display all statements (that survived optimizations in Clinic in angr). So this way there would be no empty blocks in the final graph.

Please let me know if my explanation above makes sense!

ltfish commented 10 months ago

I take back whatever I said. I did not realize that you were talking about the proximity view. For proximity view, it's totally fine to remove the empty nodes (and related edges). Please submit a PR if you are interested in taking the job!