nikic / php-ast

Extension exposing PHP 7 abstract syntax tree

Include this extension in the PHP core and provide a hook API #5

Open lisachenko opened 9 years ago

lisachenko commented 9 years ago

I'd like to suggest including this extension in the core to provide an API for various userland extensions. The main idea is to give userland code control over the produced source code.

The chain is as follows: Source Code => Tokenizer => AST => Engine hooks + PHP userland hooks => Compiler => Opcodes => Execution

Why this could be a great thing: a native source-code AST (static analyzers, code beautifiers) and language extensions (AOP, DbC, SQL-like native syntax).

This requires creating a userland function, probably register_parser_extension(ParserExtensionInterface $extension), that adds an extension as a parser hook.

The other addition is a ParserExtensionInterface interface with a process(Node $node) method that receives an ast\Node instance and should return a modified node or tree.

Each registered extension will be called in sequence, allowing it to transform the source code. The last result will be used by the compiler to generate the final opcodes.
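To make the proposed flow concrete, here is a minimal userland sketch of how such hooks could be chained. Only `register_parser_extension()`, `ParserExtensionInterface` and `process()` come from the proposal above; the stand-in `Node` class, the `ParserExtensionRegistry` class, `run_parser_extensions()` and the `AppendMarker` extension are hypothetical names used for illustration, and nothing here exists in PHP core or in php-ast.

```php
<?php
declare(strict_types=1);

// Stand-in for ast\Node so the sketch runs without the php-ast extension.
final class Node
{
    /** @var string */
    public $kind;
    /** @var array */
    public $children;

    public function __construct(string $kind, array $children = [])
    {
        $this->kind = $kind;
        $this->children = $children;
    }
}

// Interface named in the proposal: receives a node, returns a (possibly modified) node.
interface ParserExtensionInterface
{
    public function process(Node $node): Node;
}

// Hypothetical registry; in the proposal the engine itself would hold this list.
final class ParserExtensionRegistry
{
    /** @var ParserExtensionInterface[] */
    public static $extensions = [];
}

// Function named in the proposal, emulated here in userland.
function register_parser_extension(ParserExtensionInterface $extension): void
{
    ParserExtensionRegistry::$extensions[] = $extension;
}

// Hypothetical driver: calls each extension in registration order and
// feeds the result of one extension into the next.
function run_parser_extensions(Node $root): Node
{
    foreach (ParserExtensionRegistry::$extensions as $extension) {
        $root = $extension->process($root);
    }
    return $root;
}

// Hypothetical extension that appends a marker child, to make the call order visible.
final class AppendMarker implements ParserExtensionInterface
{
    /** @var string */
    private $marker;

    public function __construct(string $marker)
    {
        $this->marker = $marker;
    }

    public function process(Node $node): Node
    {
        $node->children[] = $this->marker;
        return $node;
    }
}

register_parser_extension(new AppendMarker('first'));
register_parser_extension(new AppendMarker('second'));

// The tree that leaves the last extension is what the compiler would receive.
$result = run_parser_extensions(new Node('AST_STMT_LIST'));
```

In this sketch the order of `$result->children` mirrors the registration order, which is the property the proposal relies on when several extensions transform the same tree.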

lisachenko commented 9 years ago

Ping, @nikic ) Can we bump this thread to get this beautiful gem into PHP 7.1?

nikic commented 9 years ago

@lisachenko It's still a bit early for PHP 7.1 (we haven't even branched yet), I'd wait until PHP 7 is out :)

lisachenko commented 9 years ago

@nikic oh, it's ok ) I just wanted to see that this topic is still interesting to you and that php-ast can be moved into the PHP core in a future version.

sebastianbergmann commented 9 years ago

I, too, would love to see this bundled with and enabled by default for PHP 7.1.

GrahamCampbell commented 9 years ago

:+1:

JoyceBabu commented 9 years ago

:+1:

lisachenko commented 9 years ago

For anyone who wants this in the PHP core, here is a link to the draft version of RFC: https://wiki.php.net/rfc/parser-extension-api

funivan commented 9 years ago

:+1:

JoyceBabu commented 9 years ago

@lisachenko When Opcode caching is enabled, how will the hook API work? The same PHP file could be parsed into different Opcodes based on the bound transform functions.

lisachenko commented 9 years ago

@JoyceBabu with opcode caching enabled, parsing and any registered hooks will run only once. After that there won't be any calls to the hooks, because the AST will have been compiled into opcodes and cached by the engine.

JoyceBabu commented 9 years ago

That is going to confuse a lot of users, because any time-varying or non-deterministic code in the callback will retain the value from its first invocation.

But since this will be used mainly by advanced PHP users, proper documentation will reduce the confusion.
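The "first invocation wins" behaviour can be illustrated with a toy model. This is a sketch under stated assumptions, not how the engine or OPcache is implemented: `compile_with_hook()`, the `$cache` array and the string standing in for an AST are all hypothetical.

```php
<?php
declare(strict_types=1);

// Toy compile step: runs the hook only on a cache miss, like a compiler
// whose output is stored by an opcode cache.
function compile_with_hook(string $file, callable $hook, array &$cache): string
{
    if (isset($cache[$file])) {
        return $cache[$file]; // cache hit: the hook is NOT called again
    }
    $cache[$file] = $hook('source of ' . $file);
    return $cache[$file];
}

$cache = [];
$hookRuns = 0;

// A hook with state that varies between calls (a counter stands in for
// time(), random values, environment lookups and so on).
$hook = function (string $ast) use (&$hookRuns): string {
    $hookRuns++;
    return $ast . ' [hook run #' . $hookRuns . ']';
};

$first = compile_with_hook('a.php', $hook, $cache);
$second = compile_with_hook('a.php', $hook, $cache);
```

Both calls return the same string and `$hookRuns` stays at 1, so whatever the hook computed during the first compilation is what every later request sees until the cache entry is dropped.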

PS: This will be very useful when writing Templating Engines.

lisachenko commented 9 years ago

PHP with opcode caching enabled already works the same way in production mode. All files are served directly from memory, with no file checks at all. If you want to deploy a new version of a file (for example, a Symfony container with new host names, passwords, etc.), you have to call the opcache_invalidate() function to keep the opcodes in sync with the original source code.
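As a minimal sketch of that deploy step: `opcache_invalidate()` is the real OPcache function, while the `refresh_cached_script()` wrapper is a hypothetical helper name.

```php
<?php
declare(strict_types=1);

// Hypothetical deploy helper: drops the cached opcodes for one script so
// that the next request recompiles it from the current source.
function refresh_cached_script(string $path): bool
{
    if (!function_exists('opcache_invalidate')) {
        return false; // OPcache extension not loaded, so nothing is cached
    }
    // Passing true forces invalidation even if the file's modification
    // time has not changed.
    return opcache_invalidate($path, true);
}

$invalidated = refresh_cached_script(__FILE__);
```

If parser hooks existed, this would also be the point at which they run again for that file, since recompilation is what triggers them.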

A hook is just additional logic to transform the AST, nothing more. What matters more is how to register those hooks: some core files will already have been loaded (with no hook enabled) before the hook is registered, and only the files loaded after that point will be analysed/transformed.

I can see two ways:

msjyoo commented 8 years ago

+1

jansor commented 8 years ago

+1

jeremeamia commented 8 years ago

:+1:

asgrim commented 6 years ago

@nikic @lisachenko any ideas on whether we can push this RFC along a bit? Would love to get AST available in core.

I have a question: if integrated into core, can I still parse the same code multiple times without actually loading the functions/classes within, or would it actually load e.g. the class entry?

TysonAndre commented 6 years ago

It doesn't actually load the class entry, it just parses it.

asgrim commented 6 years ago

> It doesn't actually load the class entry, it just parses it.

Yep I understand that's the current operation, I was asking if integrated into core :) thanks!

nikic commented 6 years ago

I'd expect that this would be integrated in core by simply bundling it, i.e. with the behavior entirely unchanged.

asgrim commented 6 years ago

Thanks for confirming! Any ideas if/when it might make it into core?

TysonAndre commented 4 years ago

I'd definitely like to see this in core, since it would make it easier and more common to build/adopt simple and complex tools based on it, like the tools already built on tokenizer.

EDIT: But it would make writing tooling using the latest AST version(s) much, much harder if new ast versions couldn't be backported, so I doubt this would work


One possible issue with putting this in core is supporting new AST version numbers for ast\parse_code and ast\parse_file when new syntax requires breaking changes to the AST representation.

For example, in PHP 7.4, the new node kind AST_PROP_GROUP was added. So in AST version 70, all properties became part of an AST_PROP_GROUP when they would have been an AST_PROP_DECL in AST version 50.

If php-ast had been part of the PHP core in PHP 7.3, it wouldn't have been possible for php-ast to change in 7.3.x to support that, because core needs a stable API. If a tool such as https://github.com/phan/phan were updated to support PHP 7.4 syntax, it would need to use AST version 70, but that would mean it couldn't run on PHP 7.3 (without making the AST imitate the newest AST version in a post-processing step).

  • AST_PROP_GROUP was added to support PHP 7.4's typed properties. The property visibility modifiers are now part of AST_PROP_GROUP instead of AST_PROP_DECL. Note that property group type information is only available with AST versions 70+.
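A tool that wants to run against several php-ast releases might negotiate the AST version along these lines. This is a rough sketch: `ast\get_supported_versions()`, `ast\parse_code()` and `ast\get_kind_name()` are php-ast functions, while `pick_ast_version()`, the preference list and the fallback list used when the extension is missing are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the first preferred AST version that the
// installed php-ast reports as supported, or null if none matches.
function pick_ast_version(array $supported, array $preferred): ?int
{
    foreach ($preferred as $version) {
        if (in_array($version, $supported, true)) {
            return $version;
        }
    }
    return null;
}

// Illustrative fallback so the sketch runs without the extension; with
// php-ast installed the real list comes from the extension itself.
$supported = extension_loaded('ast')
    ? ast\get_supported_versions()
    : [50, 60, 70];

// Prefer the newest representation the tool understands, then fall back.
$version = pick_ast_version($supported, [80, 70, 50]);

if ($version !== null && extension_loaded('ast')) {
    $root = ast\parse_code('<?php class A { public $x; }', $version);
    // The kind of the node holding the property depends on the negotiated
    // version, which is the compatibility problem described above.
    $classNode = $root->children[0];
    echo ast\get_kind_name($classNode->kind), "\n";
}
```

If php-ast were bundled with core and frozen per PHP release, the list of supported versions on an older PHP could never grow, so a negotiation like this would always fall back to the old representation there.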

Constants such as TYPE_FALSE, TYPE_STATIC, etc. would also need to be polyfilled in PHP 7.3 for code that supported analyzing 7.4. (Currently, code/composer.json can just require a minimum php-ast version.) This is doable and implemented in Phan, but it is a drawback.

EDIT(2020-07-15): And in php 8.0, the attributes syntax required at least one incompatible change for AST version 80