NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
Function call sequence for each function #2134

Open sajjadirn opened 4 years ago

sajjadirn commented 4 years ago

I have this Python script, which collects the functions called by each function:

from ghidra.util.task import TaskMonitor
fun_calls = []
for f1 in currentProgram.getListing().getFunctions(True):
    fun_calls.append(str(f1.getCalledFunctions(TaskMonitor.DUMMY)))

For function X I get output like: __alloca, _strcmp, _printf, _scanf, ___main

Is there a way to get the output in the order in which the calls are made? For example, if _strcmp is called first, followed by _scanf, and so on, I want output like: _strcmp, _scanf, etc.

astrelsky commented 4 years ago

You would have to iterate over the function's instructions, find the corresponding call for each called function, and sort by the calling instruction's address. With that said, it's more efficient in this situation to go straight to iterating over the instructions and avoid the duplicate work.
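
The first approach described here (take the called functions, find each call site inside the caller, then sort by call-site address) could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `calls_sorted_by_site` is hypothetical, and `references_to` stands in for something like the script API's `getReferencesTo`. Calls that go through pointers or thunks may have no direct reference to the callee's entry point, so this sketch would miss them. The sort key is the call-site address, which gives listing order and not necessarily execution order.

```python
def calls_sorted_by_site(fun, called_functions, references_to):
    """Return callees of `fun`, one entry per call site, ordered by call-site address.

    fun              -- a Function-like object (needs getBody())
    called_functions -- iterable of Function-like objects (need getEntryPoint())
    references_to    -- callable taking an address and returning Reference-like
                        objects (assumed to behave like getReferencesTo)
    """
    body = fun.getBody()
    sites = []
    for callee in called_functions:
        for ref in references_to(callee.getEntryPoint()):
            src = ref.getFromAddress()
            # Keep only call references that originate inside this function.
            if ref.getReferenceType().isCall() and body.contains(src):
                sites.append((src.getOffset(), callee))
    # Sort by the numeric offset of the calling instruction.
    sites.sort(key=lambda pair: pair[0])
    return [callee for _, callee in sites]

# Inside a Ghidra script (assumed context), usage might look like:
#   ordered = calls_sorted_by_site(fun, fun.getCalledFunctions(monitor), getReferencesTo)
```

Every callee needs its own reference lookup, and every reference has to be checked against the caller's body. That is the duplicate work a single pass over the instructions avoids.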

sajjadirn commented 4 years ago

@astrelsky Thank you for your reply. I didn't quite understand what you meant by "it's more efficient in this situation to just go right to iterating over the instructions and avoid the duplicate work."

If I iterate through the complete instruction set, how will I get the function call sequence for each function?

astrelsky commented 4 years ago

This will not get indirect function calls that the decompiler can resolve. However, neither does Function.getCalledFunctions.

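def get_calls(fun):
    # Walk the function body in ascending address order.
    for inst in currentProgram.listing.getInstructions(fun.body, True):
        # Skip instructions that have no flow targets.
        if not inst.flows:
            continue
        for op in inst.pcode:
            if op.opcode != op.CALL:
                continue
            # First flow target of the call; None if no function is defined there.
            yield getFunctionAt(inst.flows[0])
            break
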
sajjadirn commented 4 years ago

@astrelsky Thank you again, but when I pass any function name to get_calls, it always returns None. Am I doing something wrong?

astrelsky commented 4 years ago

Pass a Function, not its name. It will return a generator.
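
To illustrate what "returns a generator" means in practice, here is a minimal sketch. It assumes a Ghidra Python script context where `get_calls` from the earlier comment is defined; the helper name `call_sequence` is hypothetical and not part of Ghidra. A generator produces nothing until it is iterated, so the results have to be consumed with a loop, `list(...)`, or a comprehension.

```python
def call_sequence(fun, get_calls):
    """Return the names of the functions yielded by get_calls(fun), in order.

    `get_calls` is passed in explicitly so this helper has no hidden
    dependencies; in a script it would be the generator function shown earlier.
    """
    # Iterating drives the generator. Targets that did not resolve to a
    # function come back as None, so they are skipped here.
    return [callee.getName() for callee in get_calls(fun) if callee is not None]

# Inside a Ghidra script (assumed context), usage might look like:
#   for fun in getGlobalFunctions("main"):   # yields Function objects, not names
#       print(call_sequence(fun, get_calls))
```

The order is the order in which the call instructions appear in the listing, which may differ from the order in which they execute at run time.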

sajjadirn commented 4 years ago

@astrelsky Thank you so much!

sajjadirn commented 4 years ago

By the way, this does not return calls made to external functions. For example, if the assembly code contains the following call, the script does not include it in the output: CALL dword ptr [->KERNEL32.DLL::GetFileAttributesA]

Do you know if this can be extracted too? Thank you, @astrelsky

astrelsky commented 4 years ago

Change it to the following:

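def get_calls(fun):
    # Walk the function body in ascending address order.
    for inst in currentProgram.listing.getInstructions(fun.body, True):
        # Skip instructions that have no flow targets.
        if not inst.flows:
            continue
        for op in inst.pcode:
            # Accept direct calls (CALL) and indirect calls (CALLIND).
            if op.opcode not in [op.CALL, op.CALLIND]:
                continue
            # First flow target of the call; None if no function is defined there.
            yield getFunctionAt(inst.flows[0])
            break
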
sajjadirn commented 4 years ago

Thank you so much, man! This is what I needed, @astrelsky