pfalcon / ScratchABlock

Yet another crippled decompiler project
https://github.com/EiNSTeiN-/decompiler/issues/9#issuecomment-103221200
GNU General Public License v3.0
103 stars 23 forks source link

Write a tool to convert arbitrary CFG to a PseudoC program #32

Closed pfalcon closed 5 years ago

pfalcon commented 5 years ago

There're many graph example floating around for various graph algos. At the same time, ScratchABlock standardized on PseudoC programs as its input. It would be useful to be able to take description of an arbitrary graph, and output a PseudoC program whose CFG corresponds to a graph.

pfalcon commented 5 years ago

Thought about that for some time, and 12775ac75f7b951459ea7b82b811754525e119e8 just confirms that manual conversion leads to goofs.

@maximumspatium, Please see if you'd be interested to help here.

maximumspatium commented 5 years ago

I'm game. Which format should we consider as input? Dot?

Should the resulting PseudoC program contain only control statements or is there a need for generating some additional actions (assignments, calls whatsoever) in-between?

pfalcon commented 5 years ago

Thanks!

Yep, having affinity to dot is good idea IMHO. I mean, not just:

A -> B
C -> D

but also

A -> B -> C
D -> E

any other dot boilerplate can be just skipped.

Should the resulting PseudoC program contain only control statements or is there a need for generating some additional actions (assignments, calls whatsoever) in-between?

If you look at the manual conversion example linked above, I used nop() for some reason, but I don't pledge it's needed. Minimal form apparently, as long as it's clear where later to add dataflow statements.

pfalcon commented 5 years ago

And to clarify, I'd expect it to favor full-thru basic block semantics. I.e., sort nodes lexicographically and if there's edge between consecutive nodes, goto is not needed there. Or otherwise code will be too eye-bleeding to look at IMHO.

pfalcon commented 5 years ago

https://github.com/pfalcon/ScratchABlock/commit/bc7a33f78eaf699a6a9395bcb3ffa397d2819332 should help with this task.

mewmew commented 5 years ago

@maximumspatium and @pfalcon, is still still worked on? I'd be curios to learn from it :)

pfalcon commented 5 years ago

is still still worked on? I'd be curios to learn from it :)

Well, I didn't look into it myself again yet, and didn't hear from @maximumspatium. It should be a pretty simple script anyway ;-).

maximumspatium commented 5 years ago

@maximumspatium and @pfalcon, is still still worked on? I'd be curios to learn from it :)

There is a prototype I didn't touched since December 2018 due to lack of time. I'll look into it during the next days. Thanks for remembering me...

pfalcon commented 5 years ago

Ping @maximumspatium, if you'd like to finish this. Otherwise, I'm sharpening my teeth to bite at SSA again, so might hack up something myself in surprise.

maximumspatium commented 5 years ago

Ping @maximumspatium, if you'd like to finish this

It's on the way. Hoping to deliver it tonight...

maximumspatium commented 5 years ago

@pfalcon I have a couple of graphs for testing purposes, pulled from various books, but it would be nice to have more. Do you have smth you want to get converted next?

pfalcon commented 5 years ago

Well, the idea is kinda opposite, to have such a tool handy for when you see a cute graph pictured somewhere. But yes, if you're keen to know, I remembered about it because I'm reading a funny thesis called bboissin-thesis-2010-09-22.pdf, it has some graphs which would be nice to try.

pfalcon commented 5 years ago

Unfortunately, the tool implemented in #35 didn't work correctly on all cases. This is partly due to deficiency in PseudoC, which didn't have a control flow construct with multiple (>2) successors, as required to represent arbitrary graphs. A generalized "if" statement, accepting multiple conditions, and optional "else" clause was added to rectify that, and described in https://github.com/pfalcon/ScratchABlock/blob/master/docs/PseudoC-spec.md#jumps

As the tool would need changes for that, I also took a chance to reimplement it without external dependencies, using existing ScratchABlock modules, and a parser for subset of .dot format in 10 lines of code. A few regression tests are added (examples from the older tool converted to tests), though more are definitely needed.

Closing the issue.