joxeankoret / diaphora

Diaphora, the most advanced Free and Open Source program diffing tool.
http://diaphora.re
GNU Affero General Public License v3.0
3.68k stars 375 forks

Question: automated diffing? #70

Open msm-code opened 7 years ago

msm-code commented 7 years ago

Sorry if it's explained somewhere in the documentation or on the internet (I really didn't find anything), but is it possible to automate Diaphora? Without manually running IDA, preferably from the command line.

I mean something like writing a shell script that diffs two provided binaries and exports the results to (optimally) a PNG image / PDF / text code diff or (at least) some database?

(I used to have something automated for BinDiff, but I prefer Diaphora for many reasons)

joxeankoret commented 7 years ago

Yep, you can. See https://github.com/joxeankoret/diaphora/issues/62 for details. You will need to run IDA in batch mode after setting the environment variables, something like this:

$ export DIAPHORA_AUTO=1
$ export DIAPHORA_EXPORT_FILE=/path/to/your/new_db.sqlite
$ export DIAPHORA_USE_DECOMPILER=1 # optionally, use the decompiler
$ idaq -A -B -Sdiaphora.py your_binary

Then, for auto-diffing, check the mentioned issue.
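For scripting the export step, the shell commands above could be wrapped roughly as follows. This is only a sketch: the environment variables and the `idaq -A -B -Sdiaphora.py` invocation come from this comment, while the helper names (`diaphora_export_env`, `ida_batch_command`, `export_binary`) are hypothetical, and it assumes `idaq` is on the PATH and that `diaphora.py` can be resolved from the working directory.

```python
import os
import subprocess


def diaphora_export_env(export_file, use_decompiler=True, base_env=None):
    """Return a copy of the environment with the Diaphora batch variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DIAPHORA_AUTO"] = "1"
    env["DIAPHORA_EXPORT_FILE"] = export_file
    if use_decompiler:
        env["DIAPHORA_USE_DECOMPILER"] = "1"
    else:
        # Make sure a value inherited from the parent shell does not leak in.
        env.pop("DIAPHORA_USE_DECOMPILER", None)
    return env


def ida_batch_command(binary, ida="idaq", script="diaphora.py"):
    """Build the batch-mode command line quoted in this thread."""
    return [ida, "-A", "-B", "-S" + script, binary]


def export_binary(binary, export_file, use_decompiler=True, ida="idaq"):
    """Run IDA in batch mode so that Diaphora exports `binary` to `export_file`.

    Assumption: the IDA executable name and script location match your setup.
    """
    return subprocess.run(
        ida_batch_command(binary, ida=ida),
        env=diaphora_export_env(export_file, use_decompiler),
        check=True,
    )
```

Calling `export_binary("old.bin", "/path/to/old.sqlite")` and then the same for the new binary would produce the two databases needed for the diffing step.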

msm-code commented 7 years ago

It's working, thanks.

Is it possible to automatically export all non-matching function/asm diffs to some machine-readable format (or different files)?

I browsed through the code, and although I didn't find that functionality (I may be wrong though), it looks like this could be quite easy to implement using for example HtmlFormatter - am I right?

If yes, do you think this is something useful for Diaphora (I'll probably have to create something like that anyway)?

joxeankoret commented 7 years ago

Uhm... right now, the diffing results can be exported to a .sqlite database, and it can be automated. That should be enough. Is this what you're asking for or am I mistaken?

msm-code commented 7 years ago

In a way, I wanted to take that .sqlite database and export each function from the diff to human-readable HTML (similar to the "diff pseudo-code" function in the IDA plugin).
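A rough sketch of that idea is shown below. It uses `difflib.HtmlDiff` from the Python standard library rather than the HtmlFormatter mentioned earlier, and it pairs functions simply by name, which is much cruder than Diaphora's own matching. The table and column names (`functions`, `name`, `pseudocode`) are assumptions about the export schema and should be checked against a real database before use.

```python
import difflib
import sqlite3


def load_pseudocode(conn, table="functions", name_col="name", code_col="pseudocode"):
    """Map function name -> pseudo-code text.

    ASSUMPTION: the table and column names are guesses at the export schema.
    """
    query = "SELECT {}, {} FROM {}".format(name_col, code_col, table)
    return {name: (code or "") for name, code in conn.execute(query)}


def html_diffs(old_funcs, new_funcs):
    """Yield (name, html_page) for same-named functions whose text differs."""
    differ = difflib.HtmlDiff(wrapcolumn=80)
    for name in sorted(old_funcs.keys() & new_funcs.keys()):
        old, new = old_funcs[name], new_funcs[name]
        if old != new:
            yield name, differ.make_file(
                old.splitlines(),
                new.splitlines(),
                fromdesc=name + " (old)",
                todesc=name + " (new)",
            )
```

Each yielded page is a standalone HTML document, so writing one file per function gives a browsable set of side-by-side diffs.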

joxeankoret commented 7 years ago

Ah, that's different. There is no functionality for this right now, but it could be added without too much effort, I think. Noted. Although, due to health problems in my family, I will not be able to do it soon.

msm-code commented 7 years ago

Sure, I'll look into the code and see if I'll be able to do it myself. I'll update you with my results.

Thanks!

joxeankoret commented 7 years ago

Thanks to you!

CeGenreDeChat commented 5 years ago

Hi, I'm having some problems with automated diffing. Let me explain: first I ran the script directly as "diaphora.py -o output db1 db2", but it doesn't seem to use the decompiler, so the results are not good enough. Using the environment variables just creates a SQLite database of the binary. You say "Then, for auto-diffing, check the mentioned issue.", but I didn't find anything in that issue about auto-diffing while using the IDA decompiler.

joxeankoret commented 5 years ago

Hey @LeChatt! I'm having some trouble understanding your issue, but I think I know what's happening. Check this list, please:

Remember that for the decompilation output to be used, it must be set at export time, not at diffing time.
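Because the decompiler setting only matters during export, the diffing step itself reduces to the command line quoted earlier in this thread ("diaphora.py -o output db1 db2"). A minimal wrapper might look like the sketch below; the function names are hypothetical, and running the script with the current Python interpreter is an assumption that may not match every Diaphora version.

```python
import subprocess
import sys


def diaphora_diff_command(db1, db2, output, script="diaphora.py", python=sys.executable):
    """Build the command line for diffing two already exported databases."""
    return [python, script, "-o", output, db1, db2]


def diff_databases(db1, db2, output, script="diaphora.py"):
    """Diff two exported databases and write the results to `output`.

    Both inputs must have been exported with DIAPHORA_USE_DECOMPILER=1
    if the pseudo-code is meant to take part in the comparison.
    """
    return subprocess.run(
        diaphora_diff_command(db1, db2, output, script=script),
        check=True,
    )
```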

PS: If that's not the issue you are having, get back to me, but explain it a bit more because frankly I would be lost.

CeGenreDeChat commented 5 years ago

Yes, my question was not very clear. Let me give you some context:

My problem is that the results are very different from what I get when I use the IDA interface directly (opening IDA, loading the diaphora.py script, filling in "export IDA database to SQLite" and "export SQLite database to diff against", checking "Use the decompiler if available" and some other options). The diff is better when I use the IDA interface.

I checked my environment variable (DIAPHORA_USE_DECOMPILER=1) and that I am using the right executable: ida/idat for 32-bit binaries and ida64/idat64 for 64-bit ones.

joxeankoret commented 5 years ago

Aaah, now I understand. I guess there is some difference between the default options when using the GUI and when running in batch mode. Let me take a look at it and I will get back to you.

CeGenreDeChat commented 5 years ago

Thanks to your answer, I managed to find where the difference was. I tried changing options inside the main of diaphora.py:

if do_diff:
    bd = CBinDiff(db1)
    bd.ignore_all_names = False

Setting ignore_all_names to False gives me the same result as with the GUI (with the default configuration).

joxeankoret commented 5 years ago

I have applied the patch: https://github.com/joxeankoret/diaphora/commit/21a1486109c735aa84d15626f14db5b63229f655

Thanks for reporting!