ircmaxell / php-cfg

A Control Flow Graph implementation in PHP
MIT License

[Meta] Documentation #19

Open afraca opened 8 years ago

afraca commented 8 years ago

I know this package is still in wild development, but I'm trying to wrap my head around how it works internally, to find out what information I can extract and in what ways. It seems quite promising, thanks for your great work guys!

There are literally 1300-line files with 0 comments in them. I know good code is self-documenting, but sometimes describing the flow of your program is still really helpful.

nikic commented 8 years ago

This package compiles the PHP AST into an intermediate representation in static single assignment (SSA) form.

The SSA construction is implemented using the algorithm described in "Simple and Efficient Construction of Static Single Assignment Form" by Braun et al. (PDF). This particular algorithm builds SSA form directly from the AST, without going through a non-SSA IR first.
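To make the idea concrete, here is a minimal sketch of the core of that algorithm in Python. It is not php-cfg's code: the class and method names (`SSABuilder`, `write_variable`, `read_variable`, `seal_block`) are assumptions loosely modeled on the paper's pseudocode, values are plain strings, and uses inside ordinary instructions are not rewritten when a trivial phi is removed, which a real implementation would have to do.

```python
UNDEF = "undef"  # placeholder for reads of never-assigned variables

class Block:
    def __init__(self, name):
        self.name = name
        self.preds = []            # predecessor blocks
        self.sealed = False        # True once no more predecessors will be added
        self.incomplete_phis = {}  # variable -> Phi awaiting operands

class Phi:
    def __init__(self, block):
        self.block = block
        self.operands = []
        self.users = set()         # other phis that use this phi as an operand

class SSABuilder:
    def __init__(self):
        self.current_def = {}      # (variable, block) -> current value

    def write_variable(self, var, block, value):
        self.current_def[(var, block)] = value

    def read_variable(self, var, block):
        if (var, block) in self.current_def:   # local definition exists
            return self.current_def[(var, block)]
        if not block.sealed:                   # predecessors still unknown
            val = Phi(block)
            block.incomplete_phis[var] = val
        elif len(block.preds) == 1:            # no merge point, no phi needed
            val = self.read_variable(var, block.preds[0])
        else:
            val = Phi(block)
            self.write_variable(var, block, val)   # break cycles in loops
            val = self._add_phi_operands(var, val)
        self.write_variable(var, block, val)
        return val

    def seal_block(self, block):
        for var, phi in block.incomplete_phis.items():
            self._add_phi_operands(var, phi)
        block.sealed = True

    def _add_phi_operands(self, var, phi):
        for pred in phi.block.preds:
            op = self.read_variable(var, pred)
            phi.operands.append(op)
            if isinstance(op, Phi):
                op.users.add(phi)
        return self._try_remove_trivial_phi(phi)

    def _try_remove_trivial_phi(self, phi):
        same = None
        for op in phi.operands:
            if op == same or op is phi:
                continue                       # duplicate value or self-reference
            if same is not None:
                return phi                     # merges two distinct values: keep it
            same = op
        if same is None:
            same = UNDEF                       # unreachable or never assigned
        users = [u for u in phi.users if u is not phi]
        for key, val in list(self.current_def.items()):
            if val is phi:
                self.current_def[key] = same   # reroute definitions to `same`
        for user in users:
            user.operands = [same if op is phi else op for op in user.operands]
            if isinstance(same, Phi):
                same.users.add(user)
        for user in users:
            self._try_remove_trivial_phi(user) # users may have become trivial
        return same

def build_loop_example():
    """entry -> header <-> body, header -> exit; x changes in the loop, y does not."""
    b = SSABuilder()
    entry, header, body, exit_ = (Block(n) for n in ("entry", "header", "body", "exit"))
    b.seal_block(entry)
    b.write_variable("x", entry, "x0")
    b.write_variable("y", entry, "y0")
    header.preds.append(entry)                 # back edge not known yet: unsealed
    body.preds.append(header)
    b.seal_block(body)
    b.read_variable("x", body)                 # creates an incomplete phi in header
    b.read_variable("y", body)
    b.write_variable("x", body, "x1")
    header.preds.append(body)                  # add the back edge, then seal
    b.seal_block(header)
    exit_.preds.append(header)
    b.seal_block(exit_)
    return b.read_variable("x", exit_), b.read_variable("y", exit_)
```

In this example `x` ends up as a phi in the loop header merging `x0` and `x1`, while the phi created for `y` turns out to be trivial and collapses back to `y0`.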

I doubt this package will be getting good docs soon :)

afraca commented 8 years ago

Thanks for the info :+1: As I expected, the paper deals with a limited example language. (I haven't read these SSA papers, but the whole paper does not seem to mention what the language is, or its AST nodes?) I've seen this package run on a file with classes before, but I did not check the results.

How are things like namespaces, classes, try/catch etc. handled? Is this a custom approach of yours (@nikic and @ircmaxell), or does it come from a related paper? Either way, it would be awesome to make this addition to the implementation more explicit somewhere!

Why do you think no good docs will be coming soon? Is this some experiment which shouldn't be used by others? Is documentation not necessary because there's a whole paper discussing most of the stuff?

nikic commented 8 years ago

Namespaces are mostly handled statically (same as in PHP), i.e. names are resolved as far as possible using the current namespace and alias table. For unqualified calls and constant lookups we store the two possible name variants (e.g. https://github.com/ircmaxell/php-cfg/blob/master/lib/PHPCfg/Op/Expr/NsFuncCall.php). (Well, we do since an hour ago...)
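As a rough illustration of the "two possible name variants" idea, here is a small Python sketch of static name resolution for function calls. The function `candidate_function_names` and its parameters are hypothetical and do not mirror php-cfg's API; the sketch also ignores PHP's case-insensitive matching and the `namespace\` prefix.

```python
def candidate_function_names(name, namespace, function_aliases=None, namespace_aliases=None):
    """Return the possible fully qualified names for a call, most specific first.

    `function_aliases` stands in for `use function ...` imports and
    `namespace_aliases` for `use ...` imports (both hypothetical dicts).
    """
    function_aliases = function_aliases or {}
    namespace_aliases = namespace_aliases or {}

    # Fully qualified (\foo\bar): nothing left to resolve.
    if name.startswith("\\"):
        return [name[1:]]

    # Qualified (Sub\foo): the first segment may be an imported alias,
    # otherwise the current namespace is prepended.
    if "\\" in name:
        first, rest = name.split("\\", 1)
        if first in namespace_aliases:
            return [namespace_aliases[first] + "\\" + rest]
        return [namespace + "\\" + name] if namespace else [name]

    # Unqualified (foo): an explicit function import wins.
    if name in function_aliases:
        return [function_aliases[name]]

    # In the global namespace there is only one candidate.
    if not namespace:
        return [name]

    # Otherwise the call cannot be resolved statically: PHP tries the
    # namespaced function first and falls back to the global one at runtime,
    # so both variants have to be kept.
    return [namespace + "\\" + name, name]
```

For example, `candidate_function_names("strlen", "App\\Util")` yields both `App\Util\strlen` and `strlen`, which is the situation where two names need to be stored.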

Classes are currently modeled as normal statement nodes, see for example: https://github.com/ircmaxell/php-cfg/blob/master/test/code/class.test They should probably be moved into a separate structure outside the CFG.

As to try/catch ... well, right now it's being ignored completely: https://github.com/ircmaxell/php-cfg/blob/master/lib/PHPCfg/Parser.php#L594 I don't think we'll support correct SSA for this (it would require putting every single instruction in its own basic block). Probably we should still compile this, but mark the function as inaccurate.

"Is this some experiment which shouldn't be used by others?" <= That one ;) This isn't a production-grade library...