GrammaTech / ddisasm

A fast and accurate disassembler
https://grammatech.github.io/ddisasm/
GNU Affero General Public License v3.0

[Question] Any simple way to get function entries? #61

Closed: tbbatbb closed this issue 1 year ago

tbbatbb commented 1 year ago

Hi, thank you for making DDisasm open source. I'm trying to identify function boundaries with DDisasm. The expected result should look like this:

name, head, tail
func1, 0, 420
func2, 424, 516
...

At present, I disassemble the binary and then grep the output for the string @function, but that does not give me the address of the function. Is there a better way to get function boundaries? 😃

adamjseitz commented 1 year ago

Instead of generating assembly files directly, you'll want to generate a GTIRB IR file with ddisasm. GrammaTech has a number of Python packages available for parsing GTIRB files with scripts.

Here's a sample script for getting function entry points:

#!/usr/bin/env python3
# print_function_entry_points.py

import argparse
from gtirb import IR
from gtirb_functions.function import Function

def main() -> None:
    ap = argparse.ArgumentParser(description="Show functions in GTIRB")
    ap.add_argument("infile")

    args = ap.parse_args()
    ir = IR.load_protobuf(args.infile)

    for m in ir.modules:
        fns = Function.build_functions(m)

        for fn in fns:
            entrypoint_addrs = [hex(block.address) for block in fn.get_entry_blocks()]
            print(fn.get_name(), entrypoint_addrs)

if __name__ == "__main__":
    main()

$ ddisasm --ir my_binary.gtirb my_binary
...
$ pip3 install gtirb gtirb_functions
...
$ python3 print_function_entry_points.py my_binary.gtirb
_start ['0x1060']
frame_dummy ['0x1140']
main ['0x1174']
FUN_1040 ['0x1040']
_init ['0x1000']
FUN_1020 ['0x1020']
_fini ['0x1218']
deregister_tm_clones ['0x1090']
FUN_1050 ['0x1050']
__libc_csu_init ['0x11a0']
__do_global_dtors_aux ['0x1100']
fun ['0x1149']
register_tm_clones ['0x10c0']
__libc_csu_fini ['0x1210']
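
The script above prints entry points only, while the question asked for a head and a tail per function. A minimal sketch of one way to approximate that, assuming `Function.get_all_blocks()` from gtirb_functions returns the function's code blocks and that each block exposes `address` and `size`. The helper names `function_bounds` and `print_function_bounds` are hypothetical, and the tail reported here is an exclusive end address (one past the last byte of the highest block):

```python
from typing import Iterable, Optional, Tuple


def function_bounds(blocks: Iterable) -> Optional[Tuple[int, int]]:
    """Return (head, tail) for blocks exposing .address and .size.

    head is the lowest block address; tail is the exclusive end of the
    highest block. Blocks without an address are skipped.
    """
    spans = [
        (b.address, b.address + b.size)
        for b in blocks
        if b.address is not None
    ]
    if not spans:
        return None
    head = min(start for start, _ in spans)
    tail = max(end for _, end in spans)
    return head, tail


def print_function_bounds(gtirb_path: str) -> None:
    """Print 'name, head, tail' for every function in a GTIRB file."""
    # Imported lazily so function_bounds() works without the gtirb packages.
    from gtirb import IR
    from gtirb_functions.function import Function

    ir = IR.load_protobuf(gtirb_path)
    print("name, head, tail")
    for m in ir.modules:
        for fn in Function.build_functions(m):
            # Assumption: get_all_blocks() yields every block of the function.
            bounds = function_bounds(fn.get_all_blocks())
            if bounds is not None:
                print(f"{fn.get_name()}, {bounds[0]}, {bounds[1]}")
```

Calling `print_function_bounds("my_binary.gtirb")` would list one line per function. The min/max range is only an approximation: a function's blocks need not be contiguous, so the range may cover padding or bytes that belong to other functions.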

tbbatbb commented 1 year ago

Whoa, thanks for your example, I'll try it later! 😄