nikic / php-ast

Extension exposing PHP 7 abstract syntax tree

[rant] Why are Concrete Syntax Trees so neglected in the world of PHP parsers? #14

Closed rulatir closed 9 years ago

rulatir commented 9 years ago

Every PHP parser I can find offers only an Abstract Syntax Tree (AST).

This is poor.

CSTs are absolutely necessary for perfect reconstructibility, and perfect reconstructibility is absolutely necessary if the code transformation being implemented MUST preserve line numbers. All the AST pretty printers out there will completely mangle line numbers.

flavius commented 9 years ago

Shameless plug: when they're not neglected, the parser is never finished: https://github.com/flavius/phpmeta

nikic commented 9 years ago

This project exposes PHP's internal AST. Because that AST is used only during compilation, we have no use for a CST -- it would only require more memory and make usage harder.

You can perform perfect reconstruction using an AST with sufficient location information. This AST does not have it, though we might implement complete location tracking in PHP 7.1. The trick is to modify the source code directly, based on the AST, instead of pretty-printing it. I usually do this via a mutable string helper that queues modifications, so that every edit can be expressed in absolute offsets into the original source. It's definitely not great, but I've found it to work well for my purposes.
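To make the queued-modification idea concrete, here is a minimal sketch in Python. It is not the helper referred to above, and the class and method names (`MutableSource`, `replace`, `insert`, `render`) are hypothetical. It assumes the start offsets and lengths come from AST nodes that carry full location information, which this extension does not currently provide. In the example, `str.index` stands in for those node offsets.

```python
class MutableSource:
    """Queue edits against ORIGINAL offsets, then apply them in one pass.

    Hypothetical helper for illustration only. Because edits are not applied
    until render(), an earlier edit never shifts the offsets of a later one.
    """

    def __init__(self, source: str) -> None:
        self.source = source
        self.edits = []  # list of (start, length, replacement) tuples

    def replace(self, start: int, length: int, text: str) -> None:
        if start < 0 or length < 0 or start + length > len(self.source):
            raise ValueError("edit is outside the source")
        self.edits.append((start, length, text))

    def insert(self, pos: int, text: str) -> None:
        self.replace(pos, 0, text)

    def remove(self, start: int, length: int) -> None:
        self.replace(start, length, "")

    def render(self) -> str:
        out = []
        cursor = 0  # position in the original source copied so far
        # sorted() is stable, so edits at the same offset keep queue order.
        for start, length, text in sorted(self.edits, key=lambda e: e[0]):
            if start < cursor:
                raise ValueError("overlapping edits")
            out.append(self.source[cursor:start])  # untouched text, verbatim
            out.append(text)
            cursor = start + length
        out.append(self.source[cursor:])
        return "".join(out)


source = "<?php\n$a = foo(1,\n    2);\n"
doc = MutableSource(source)

# Stand-ins for offsets that would come from AST node locations.
doc.replace(source.index("foo"), len("foo"), "bar")
doc.insert(source.index("2"), "/* second */ ")

result = doc.render()
print(result)
```

Everything outside the queued ranges is copied verbatim, so whitespace, comments and line breaks survive. Line numbers stay stable as long as the replacement text does not add or remove newlines.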

There is also https://github.com/grom358/pharborist, which I think provides a CST. I'm not sure about this, though, as I never looked into it due to the GPL licensing.