ircmaxell / php-compiler

A compiler. For PHP
MIT License
794 stars, 33 forks

RFC: Stay VM/JIT, or abandon and go only AOT? #63

Open ircmaxell opened 5 years ago

ircmaxell commented 5 years ago

Currently, this project aims to be a drop-in replacement for PHP-SRC, implementing a JIT compiler and virtual machine as well as AOT compilation. This would let users replace their current systems with no change to their deployment mechanism (just like HHVM did). They could change the deployment mechanism to achieve some gains, but it would work out of the box either way.

However, AOT compilation requires a few steps that JIT/VM do not, and JIT/VM would need to make some assumptions that AOT does not. For example:

Proposal

The idea that I'm thinking about is removing the VM from the compiler. There would then be two modes: JIT and AOT.

However, JIT mode would be very different from what people normally consider JIT. It would basically be "AOT Compile and Execute In Process", versus the AOT mode, which is "AOT Compile and generate an executable to be run later".

The JIT mode would allow something that "feels" like normal PHP (just run jit file.php and it will execute just like PHP-SRC does), but takes a lot longer to boot (since it's doing a full compilation pass). But once it's booted, it's running native compiled code.

It also would allow the AOT mode to embed the JIT mode in the executable to handle cases of dynamic code (again, eval(), etc). So if compiled code called eval, internally in the executable it would compile the evaled code to native and then run it inside the process.
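
The split between the two modes, and the way the AOT artifact falls back to the in-process path for dynamic code, can be sketched in a few lines of PHP. This is only an illustration under loud assumptions: `compileToNative`, `runInProcess`, and `buildArtifact` are hypothetical names, and the "compiler" here is a toy that sums integers rather than anything emitting native code.

```php
<?php
declare(strict_types=1);

// Toy stand-in for the compiler (hypothetical): turns "1 + 2 + 3" into a
// closure returning the sum. The real project would emit native code instead.
function compileToNative(string $source): Closure
{
    $terms = array_map(
        static fn (string $term): int => (int) trim($term),
        explode('+', $source)
    );

    return static fn (): int => array_sum($terms);
}

// "JIT" mode as proposed: run the full compile, then execute in this process.
function runInProcess(string $source): int
{
    $compiled = compileToNative($source); // the whole compilation pass happens at boot
    return $compiled();                   // after that, only compiled code runs
}

// AOT mode: compile now and return an artifact to be run later. Dynamic code
// met at run time is handed to the embedded in-process path.
function buildArtifact(string $source): Closure
{
    $compiled = compileToNative($source);

    return static function (?string $dynamicSource = null) use ($compiled): int {
        $result = $compiled();
        if ($dynamicSource !== null) {
            // Stand-in for compiled code calling eval() on a string.
            $result += runInProcess($dynamicSource);
        }
        return $result;
    };
}

echo runInProcess('1 + 2 + 3'), PHP_EOL;

$artifact = buildArtifact('10 + 20');
echo $artifact('4 + 5'), PHP_EOL;
```

The point of the sketch is only that both modes share one compile step; they differ in whether the result runs immediately or is kept for later.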

The prime benefits would be to greatly simplify the compiler and to focus only on the mode that most differentiates it from PHP-SRC. The prime detriment would be losing the ability to "drop-in" replace PHP-SRC (which would really hurt adoption significantly).

Request for Comments

The change from a VM-primary compiler to an AOT-primary compiler has a lot of side effects. It means the project no longer aims to be a drop-in replacement for PHP-SRC. However, by doing that, it removes performance constraints around the compile step and opens the door to significant potential performance gains for compiled code (at the execution phase).

So, I'd like to ask for comments and votes. Should the project aim to be a drop-in replacement for PHP-SRC? Or should we ditch the dynamic aspect and go for pure performance at the expense of compile time and deployment mechanisms?

Please vote with "reactions": Rocket ship means move to AOT, eyes for keep the VM.

PurHur commented 5 years ago

I'm not (yet) an expert in this field, but I have my opinions.

0) If you've got a working compiler, I want to compile nothing else.
1) Let the php-src guys do their "normal" JIT. This project doesn't have enough traction to get that started on a full PHP replacement.
2) I said it has to be just as easy as php-src to run, but changing the deployment is acceptable for me. The person who's changing the PHP binary knows well what they are doing there.
3) The time span in which the "full replace" solution would work is insane. (:D hopefully not.)

Changing the JIT to the "interpreter" sounds good! The compiled code could be cached, right?

Girgias commented 5 years ago

Hey,

First of all thanks for pouring hours into this (semi-ridiculous) project :)

Now in my mind, I would prefer that this doesn't try to be a drop-in replacement but just a compiler which compiles PHP into an executable. I don't think trying to keep up with a VM which mimics php-src is sensible, as php-src will always be "better".

Also, I feel having an AOT mode (or the JIT version of that) would add a new layer and new use cases for PHP (desktop apps come to mind).

Now, I don't know whether going this route would make it easier to implement all the features from the core engine, or whether it would still mean reimplementing everything.

Anyway just my opinion do whatever you think is best :)

ghost commented 5 years ago

Personally, I'd like to go the AOT route with a "dynamic JIT call" (for eval, dynamic calls, etc.). I think this project should really focus on being a compiler instead of a drop-in replacement and let php-src do their normal VM-with-JIT thing.

tomwalder commented 5 years ago

“Do one thing and do it well”

Crell commented 5 years ago

The odds of being a viable drop-in replacement for php-src are, frankly, low. It's a lot of code, a lot of work, a lot of edge cases, and probably very little benefit to the typical user. The atypical user is the one that's going to benefit from compiled code, and if they're in that sort of oddball case they should be OK with an oddball deploy process.

php-src is pretty damned fast as is and getting steadily faster, so marginal gains there are probably not worth the effort.

driusan commented 5 years ago

IMO the value of having a JIT mode is lowered by the fact that PHP ~7~ 8 will have a JIT mode with the "real" PHP implementation anyways. On the other hand, there's a reasonably good possibility that an LLVM backend would produce better JIT code than PHP, since the backend implementation is more mature and highly optimized, but I don't think we even have any benchmarks to back that up. It's also easier for new contributors to get a grasp on the VM code than on the AOT code and the macro system.

So I've decided to vote "Confused face" for "I can't decide."

dstogov commented 5 years ago

For big projects, the "compile everything" mode in the PHP 8 JIT currently produces code that is less efficient than the VM interpreter. Actually, the code is better, but because of the big code base, the CPU suffers from iCache misses and spends more cycles in front-end stalls.

After a few years of work in this area, I'm very sceptical about AOT compilers for dynamic languages.

ircmaxell commented 5 years ago

The one thing that makes me curious about the AOT approach is that it doesn't need to be good for all code. Meaning, if it doesn't target drop-in replacement of Zend PHP for all cases, then the cases where performance is worse due to cache locality just wouldn't matter as much.

So projects like WordPress wouldn't really be candidates, but something like Composer would be a perfect one. Relatively resource heavy, and limited in scope.

Though, there is one other option: make the AOT compiler only target a strict subset of valid PHP. Something akin to asm.js or RPython. Disallowing overly dynamic code, eval, certain type changes, references, etc. Make all other code a compile error, and attempt to specialize towards that code...

Crell commented 5 years ago

If you wanted to target a strict subset of PHP, that would create a demand for a tool (either in php-compiler or separately) to transpile all of PHP down to just that subset, which is what asm.js does. How practical such a tool would be I don't know, but that would be the next logical step.

PurHur commented 5 years ago

I wonder whether there would really be a benefit to supporting eval. If PHP goes in the direction of becoming a more robust language, then there would be an RFC, within a year, to kill that feature anyway.

ghost commented 5 years ago

Why should eval be killed? Other languages have it too. It's not required to kill eval in order to be robust.

ircmaxell commented 5 years ago

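It wouldn't just be eval. It would be:

  • func_get_args
  • Reflection
  • eval
  • Variable Variables (including variable class, method, and properties)
  • variable includes/requires
  • references other than for built-in functions
  • non-strict mode
  • non-explicit array initialization (function foo() { $a[] = 1; })
  • argument count mismatch (too many args)
  • using undeclared properties

among many others.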

BackEndTea commented 5 years ago

So it would be PHP without allowing you to do things that you (mostly) shouldn't do.

windrunner414 commented 5 years ago

I'm doing something similar (thanks to your LLVM binding). Both AOT and JIT need to compile to LLVM bytecode first. But I changed my mind: using AOT for a dynamic language may not be a good idea.

ircmaxell commented 5 years ago

Just for a status check, I have a few flights coming up, and plan on starting a major refactor to rip out the VM, and re-architect how the compiler works. I will be moving it to an "AOT" primary compiler, and won't really support JIT for the time being (but will try not to architect that as impossible)...

azjezz commented 5 years ago

It wouldn't just be eval. It would be:

  • func_get_args
  • Reflection
  • eval
  • Variable Variables (including variable class, method, and properties)
  • variable includes/requires
  • references other than for built-in functions
  • non-strict mode
  • non-explicit array initialization (function foo() { $a[] = 1; })
  • argument count mismatch (too many args)
  • using undeclared properties
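
To make a few of the listed items concrete, here is a minimal PHP sketch pairing three of them (func_get_args, non-explicit array initialization, variable variables) with a more static equivalent. The function names are hypothetical, and which replacements such a subset would actually accept is an assumption; the thread only names what would be disallowed.

```php
<?php
declare(strict_types=1);

// Listed as disallowed: arguments read through func_get_args().
function sumDynamic()
{
    return array_sum(func_get_args());
}

// Assumed-acceptable alternative: a typed variadic parameter.
function sumStatic(int ...$numbers): int
{
    return array_sum($numbers);
}

// Listed as disallowed: $a springs into existence on first append.
function appendImplicit(): array
{
    $a[] = 1;
    return $a;
}

// Assumed-acceptable alternative: explicit initialization.
function appendExplicit(): array
{
    $a = [];
    $a[] = 1;
    return $a;
}

// Listed as disallowed: a variable variable, resolved by name at run time.
function readDynamic(string $name): int
{
    $count = 3;
    return $$name;
}

// Assumed-acceptable alternative: an explicit lookup table.
function readStatic(string $name): int
{
    $values = ['count' => 3];
    return $values[$name];
}
```

In each pair both functions behave the same for valid input under stock PHP; the second form simply gives a compiler the names and types up front.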

@ircmaxell just to note, HHVM already killed all of these for Hack, except for Reflection (to be replaced in the future).

Also, eval, require, require_once, include, and include_once are not supported in repo authoritative mode.

See: https://hhvm.com/blog/2019/02/11/hhvm-4.0.0.html (and other 4+ release notes)

muglug commented 5 years ago

AOT requires a mechanism for traversing an entire codebase to find all classes/files that could be required, and compiling them all. This would also require a mechanism to tell the compiler to not compile certain files.

Psalm does this, and I think Phan does too - every possible candidate file that might pertain to the analysed code is visited before any analysis starts. Composer's classmap is used for lookup.
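
A minimal sketch of that lookup step, assuming a Composer-style classmap (class name => file path) has already been loaded into an array. The function name and the exclude-by-path-prefix rule are hypothetical; this is not Psalm's or php-compiler's actual API.

```php
<?php
declare(strict_types=1);

/**
 * Collect the unique files an AOT pass would need to visit.
 *
 * @param array<string, string> $classmap         class name => file path
 * @param string[]              $excludedPrefixes path prefixes to leave uncompiled
 * @return string[] sorted list of files to compile
 */
function filesToCompile(array $classmap, array $excludedPrefixes = []): array
{
    $files = [];
    foreach ($classmap as $file) {
        foreach ($excludedPrefixes as $prefix) {
            if (strncmp($file, $prefix, strlen($prefix)) === 0) {
                continue 2; // excluded: skip to the next classmap entry
            }
        }
        $files[$file] = true; // keyed by path, so several classes in one file count once
    }

    $result = array_keys($files);
    sort($result);

    return $result;
}

// In a real project the map could come from Composer, for example:
//   $classmap = require 'vendor/composer/autoload_classmap.php';
$classmap = [
    'App\\Kernel' => 'src/Kernel.php',
    'App\\Helper' => 'src/Kernel.php',
    'App\\Tests\\KernelTest' => 'tests/KernelTest.php',
];

print_r(filesToCompile($classmap, ['tests/']));
```

A classmap only covers classes, so files that are pulled in purely through include/require would still need a separate discovery step.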

It wouldn't just be eval. It would be ...

These are things that could be enforced with a static analysis tool and, selfishly, an AOT mode might benefit Psalm massively.

Could extensions be supported?

rkyoku commented 4 years ago

FWIW, Phalcon offers Zephir, which allows you to compile code and use it as linked libraries.