GrammaTech / ddisasm

A fast and accurate disassembler
https://grammatech.github.io/ddisasm/
GNU Affero General Public License v3.0

[Question] How to get the analyzed instruction boundaries from Ddisasm #36

Closed ZhangZhuoSJTU closed 2 years ago

ZhangZhuoSJTU commented 2 years ago

Hi,

Thanks for your great effort in bringing Datalog disassembly to the community.

I am trying to use ddisasm as the disassembly backend for my own project, which needs the instruction boundaries of the original ELF (i.e., the disassembly result). Can ddisasm output the disassembly results directly (as Ghidra and IDA Pro do), or does that require writing code on top of the codebase?

Thanks!

aeflores commented 2 years ago

If you want assembly listings, ddisasm will output them with the --asm option; see ddisasm --help.
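For anyone scripting this step, here is a minimal sketch that assembles the command line in Python. The --asm option is the one mentioned above; the --ir option for writing a GTIRB file and the file names ex, ex.s and ex.gtirb are assumptions for illustration, so check them against ddisasm --help. The helper build_ddisasm_command is hypothetical and not part of ddisasm.

```python
import os
import shutil
import subprocess
from typing import List, Optional


def build_ddisasm_command(
    binary: str, asm: Optional[str] = None, ir: Optional[str] = None
) -> List[str]:
    """Build a ddisasm invocation (hypothetical helper, not part of ddisasm)."""
    cmd = ["ddisasm", binary]
    if asm is not None:
        # --asm: write an assembly listing to the given path.
        cmd += ["--asm", asm]
    if ir is not None:
        # --ir: assumed option for writing the GTIRB file; verify with --help.
        cmd += ["--ir", ir]
    return cmd


if __name__ == "__main__":
    cmd = build_ddisasm_command("ex", asm="ex.s", ir="ex.gtirb")
    print(" ".join(cmd))
    # Only run the tool when both ddisasm and the input binary are present.
    if shutil.which("ddisasm") and os.path.exists("ex"):
        subprocess.run(cmd, check=True)
```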

If you want code boundaries, it is relatively simple to obtain the boundaries of all basic code blocks using the GTIRB API (https://github.com/GrammaTech/gtirb), e.g.:

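import gtirb

# Load the GTIRB file and take its first module.
ir_library = gtirb.IR.load_protobuf("ex.gtirb")
m = ir_library.modules[0]
for codeblock in m.code_blocks:
    # Print each basic block as a half-open range [start, end).
    print(codeblock.address, codeblock.address + codeblock.size)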

where ex.gtirb is the gtirb file generated by ddisasm.
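Once the (start, end) pairs have been collected, a backend usually needs to ask which block a given address belongs to. The following is a minimal, standard-library sketch of that lookup; it does not use GTIRB itself. The BlockIndex class and the sample addresses are made up for illustration, and the sketch assumes the blocks do not overlap.

```python
import bisect
from typing import Iterable, List, Optional, Tuple

Range = Tuple[int, int]  # half-open interval [start, end)


class BlockIndex:
    """Address-to-block lookup over non-overlapping ranges (hypothetical helper)."""

    def __init__(self, ranges: Iterable[Range]) -> None:
        # Sort once so that every lookup can use binary search.
        self._ranges: List[Range] = sorted(ranges)
        self._starts: List[int] = [start for start, _ in self._ranges]

    def find(self, addr: int) -> Optional[Range]:
        # Locate the last block whose start is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start, end = self._ranges[i]
        # The address may still fall in a gap after that block.
        return (start, end) if addr < end else None


if __name__ == "__main__":
    # Sample ranges standing in for the pairs printed by the loop above.
    index = BlockIndex([(0x2000, 0x2004), (0x1000, 0x1010), (0x1010, 0x1020)])
    print(index.find(0x1012))
    print(index.find(0x1800))
```

If blocks can overlap in your input, an interval tree would be a better fit than this sorted-list approach.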

If you want individual instruction boundaries, you can use https://github.com/GrammaTech/gtirb-capstone to decode the instructions in each basic code block, e.g.:

import gtirb
import gtirb_capstone

ir_library = gtirb.IR.load_protobuf("ex.gtirb")
m = ir_library.modules[0]
# This example assumes an x86-64 binary; the module's own ISA is available as m.isa.
decoder = gtirb_capstone.rewriting.GtirbInstructionDecoder(gtirb.Module.ISA.X64)
for codeblock in m.code_blocks:
    # Each decoded instruction starts at insn.address.
    for insn in decoder.get_instructions(codeblock):
        print(insn.address)
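The loop above prints only where each instruction starts. To get full boundaries you also need each instruction's length. Assuming the decoder yields Capstone instruction objects, each one carries a size field alongside address, so the end is address + size. The sketch below works on plain (address, size) pairs so that it runs without GTIRB. The helper names and sample values are hypothetical.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # half-open interval [start, end)


def instruction_boundaries(insns: Iterable[Tuple[int, int]]) -> List[Range]:
    """Turn (address, size) pairs into [start, end) instruction ranges."""
    return [(addr, addr + size) for addr, size in insns]


def tiles_block(block: Range, bounds: Iterable[Range]) -> bool:
    """Check that the ranges cover the block exactly, with no gaps or overlaps."""
    start, end = block
    cursor = start
    for lo, hi in sorted(bounds):
        # Each instruction must begin where the previous one ended.
        if lo != cursor or hi <= lo:
            return False
        cursor = hi
    return cursor == end


if __name__ == "__main__":
    # Sample (address, size) pairs, e.g. collected as (insn.address, insn.size).
    insns = [(0x1000, 1), (0x1001, 3), (0x1004, 5)]
    bounds = instruction_boundaries(insns)
    for lo, hi in bounds:
        print(hex(lo), hex(hi))
    print(tiles_block((0x1000, 0x1009), bounds))
```

Checking that the instruction ranges tile their enclosing block is a cheap sanity test before handing the boundaries to another tool.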

I hope that helps!

ZhangZhuoSJTU commented 2 years ago

@aeflores Thanks! It looks great! Appreciate your prompt reply and kind help.