phpv8 / v8js

V8 Javascript Engine for PHP — This PHP extension embeds the Google V8 Javascript Engine
http://pecl.php.net/package/v8js
MIT License

Possibility to Improve Performance with Precompiled Templates/Classes ? #205

Closed virgofx closed 8 years ago

virgofx commented 8 years ago

Would it be possible to precompile templates and classes with an additional method(s) such that performance improvements could be made for server side rendering?

A couple of examples...

1) Cache React Library: Loading the initial V8Js() and running executeString with the current React + ReactServer library takes ~60ms on my current vagrant machine (Ubuntu Trusty, Nginx, PHP56, 2GB RAM, V8-4.4, with snapshot), and ~110ms with the current V8-5.0.81 without snapshot. If this is used in high-traffic production, it would be great to either somehow serialize this instance or have the capability to clone it first and then serialize it (leaving the caching technique up to the caller: files, memory-backed data store) for high throughput.

2) Cache Template/Class Collection: Would it be possible to precompile classes so that rendering components with props is faster? For small components, render times averaged between 1 and 20ms.

Thoughts? Any other areas to potentially improve performance?

kynx commented 8 years ago

Have you looked at registering React as an extension (3rd argument to __construct or via registerExtension())?

My understanding is that extensions registered this way will last as long as the process does. So if Nginx works like apache, that will mean as long as the worker process is serving requests. Not quite the same as an op-code cache available to all processes, but worth a shot.

virgofx commented 8 years ago

@kynx I have not looked at registerExtension() functionality. However, being that phpv8 is an extension for PHP I'm still a little confused how that might benefit the typical FPM SAPIs which typically serve the backend requests for most production PHP stacks in a short lived [ nginx -> spawn FPM/handle request/return response/close ] situation.

I'm not sure that hooking into Nginx/Apache would provide any benefit as the dynamic data comes from PHP; unless PHP could somehow utilize a loaded version of React (#1 above) from the FPM processes.

I would think that serialization would provide the greatest immediate benefit with the least amount of modification to the stack as developers could then cache the React Library (#1) above.

I was also wondering if there would be a way to even precompile JSX templates, without data (props/state) such that performing a render-serverside (or even client side) with data could be even faster?

virgofx commented 8 years ago

I've confirmed that using registerExtension() allows script data to be cached within PHP-FPM (on a per-process basis); however, there is 0 performance gain since the script is not compiled. It essentially just loads it in via executeScript() at construction of a new V8Js instance.

Since there is 0 performance gain with registerExtension(), do you know if it would be possible to cache at some level after the V8 engine has actually compiled script?

@stesie Do you know if this is possible? It would open the door for a larger trend in server rendering.

stesie commented 8 years ago

I think this all is a mixture of different questions, so let me pick and answer one after another ...

V8 Engine Startup Timing

Regarding the mod_php/fpm SAPI handling: as opposed to the CLI & CGI SAPIs, those SAPIs handle multiple requests, but just one request after another (in non-ZTS setups, parallel requests are handled by multiple processes). Nevertheless, V8 itself is currently initialized upon the first V8Js object construction; subsequent object constructions (possibly from different requests) can then take advantage of the already initialized V8 engine.

That is, there should already be an enormous difference (timewise) if you do the rendering twice within the same process, since the first try starts up V8 itself and compiles/executes the script. The second try solely compiles/executes the script.

So far you've just provided numbers for the first case, not the latter. You might want to give that a try to get a clue of the potential time saving.

A possible action would be to change the extension to (optionally) spin up the V8 engine when the PHP extension is loaded. This way V8 would already be started when Apache/FPM spin up spare server processes (i.e. before handling requests). The downside, of course, is that each and every process has an initialized V8 engine (which might be a problem if you have high traffic but low use of V8).

Extensions

to make it short: forget about them :)

Personally I think they do more harm than good. After all, you just provide JavaScript code upfront which is shared amongst requests (as they are then handled internally by V8) ... and they're simply re-evaluated upon instantiation, almost as if you just executeString them ...

Unbound scripts

V8 supports so-called unbound scripts (http://v8.paulfryzel.com/docs/master/classv8_1_1_unbound_script.html) which are compiled but neither bound to a context nor executed -- that might be an option to gain performance.

However PHP-V8Js currently uses V8::Isolates to isolate multiple V8Js (class) instances from one another and V8::Contexts to isolate different commonjs modules from another. Those unbound scripts are not bound to contexts, yet bound to isolates. Sharing an isolate over multiple requests probably is a problem by itself.

To get a feeling of how much could be gained that way, just run all your sources through compileString with time measuring
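The "compiled once, not bound to a context" idea can be sketched outside of V8Js. The example below is an illustration only, assuming Node.js and its built-in vm module (not the V8Js API): a vm.Script is compiled once and then run in two separate contexts. The script source and the ctxA/ctxB names are made up for this sketch.

```javascript
const vm = require('vm');

// Compile once; the compiled script is not tied to any particular context.
const script = new vm.Script('counter += 1; label + ":" + counter');

// Two independent contexts, each with its own globals (hypothetical values).
const ctxA = vm.createContext({ counter: 0, label: 'A' });
const ctxB = vm.createContext({ counter: 10, label: 'B' });

// The same compiled script runs against whichever context it is given.
const resultA = script.runInContext(ctxA);
const resultB = script.runInContext(ctxB);
const resultA2 = script.runInContext(ctxA); // ctxA keeps its own state

console.log(resultA, resultB, resultA2); // prints: A:1 B:11 A:2
```

In V8Js the nearest equivalent is compileString() followed by executeScript(); as noted above, such a compiled script still remains bound to its isolate.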

Heap snapshots

V8 supports snapshots (and up to version 4.4 they were working with shared libraries) ... and if you're compiling with snapshots it automatically generates a snapshot and links it right into the V8 library. However the mksnapshot tool run during V8 build allows to pass further JavaScript code that shall be baked into the snapshot. However I've never done that (but sounds promising & fun)

But be aware that every V8Js context then has the code baked into it and you cannot have an instance without it (even over multiple requests on a single process).

The V8 C++ API allows to do snapshotting & loading on run time so you don't have to actually compile it into V8 library itself, but do it from PHP side ... (however the sharing over multiple requests still applies)

Of course the crashing-problem of recent V8 versions built with snapshots enabled has to be solved then. Sticking with version 4.4 is not a wise option as it gets no security support anymore. A possible way out would (probably) be to statically link V8 and PHP and do snapshotting of that heap then. (but I didn't try that so far)

virgofx commented 8 years ago

@stesie Thanks for the awesome feedback. I've come to somewhat the same conclusions as you. My thoughts in exploring the V8JS world with PHP:

RegisterExtension: You're right .. "worthless". In fact, when taking a look at the code, it actually requires slightly more work because of the extra checks at instantiation as well as more work in PHP land to ensure that application code properly checks to see if the extension is loaded.

Repeat Renders: Yes, I didn't mention that in my original post, but you're absolutely right. Repeat renders of new/existing components appear to be very fast. When I rendered 10 different (similarly sized) components my timings looked like:

React Library + App Components: ~ 65ms
Render Component 1: ~ 8ms
Render Component 2: ~ 2ms
Render Components 3-10: 0.5 - 1ms

So while satisfied with the overall render performance, my only concern was about the worst actor -- the library instantiation for the relatively unchanging library + components. Sure they may change on subsequent builds; however, any cache clearing, rebuilding, precompiling phases would handle that.

Snapshots Right off the bat I noticed improvements from 120ms to 60ms when compiling with the heap snapshot feature. You raise a very interesting point of compiling with the mksnapshot feature to include pre-compiled context data. Obviously, this would be very beneficial for libraries and other script (obviously, understanding the risk of namespace collision) for FPM processes that load V8JS each instance. I will try to explore this.

Do you have any information on how to do this or references to documentation?

A side note: I was concerned about having to downgrade to 4.4.x to get this feature to work. There was very little documentation about snapshots in terms of performance, about any missing "security" components, or about other performance benefits or rationale for using later versions. Also, are there any plans to have snapshots working in current versions?

Unbound Scripts I took a look at these. In general, I'm not sure about any performance benefits. When testing compiling of script, it is very very fast. So being able to bind a script to a context doesn't seem like it would have any huge performance benefits.

This same concept goes for moving the V8JS registerExtension from being process specific to PHP specific. The gains during the loading of the initial empty context are very small. It's the execution of library code within a specific context which is where the 60ms timing (in my case) occurs. Additionally, this would hurt all shared hosting scenarios that would want empty/default contexts.

Regardless, if it were somehow possible to cache the V8JS isolates with rendered library code via some identifier, that would be huge. If it can't be done because of the internals of V8Js would it be possible to somehow serialize the isolate/contexts such that application caching could be implemented?

stesie commented 8 years ago

Repeat Renders

I'm not sure whether we talk past each other regarding timing. In pseudo code:

$timeA = microtime(true);
$v8 = new V8Js();

$timeB = microtime(true);
$v8->executeString("... react library + components ");

$timeC = microtime(true);
$v8->executeString("... render component 1 ... ");

$timeD = microtime(true);

... as I understand your last comment $timeC - $timeA is ~65ms and $timeD - $timeC is ~8ms.

I actually would be interested in $timeB - $timeA and $timeC - $timeB. And as a variation of that: the exact same measurement with $ignoreMe = new V8Js(); added right before the assignment of $timeA.

The latter could be compensated by changing the extension to immediately initialize V8 (instead of doing so lazily)

snapshots

I haven't tried building V8 as a shared library with snapshots enabled lately, it was back when 4.5.x or maybe 4.6.x were bleeding edge. Actually I never found the time to really investigate what the problem was, nor did I report to V8 devs. Anyhow I suspect the problem lies with V8 itself, i.e. that it's not V8Js' fault that it crashes. I think so since if I compiled V8 with snapshots enabled and as a shared library, then even their own programs like d8 et al failed (i.e. immediately segfault).

Also see https://bugs.chromium.org/p/v8/issues/detail?id=4192

Regarding old versions, you might consider them if you really know what code you're running. Or, stated the other way round: never use old versions if you rely on V8Js as a sandboxing tool to run untrusted/customer code, as there are no updates (security fixes) for V8 older than 4.8 (currently the stable version).

Snapshotting generally only influences startup speed (as you noticed, 110ms vs. 60ms), but shouldn't affect runtime performance thereafter.

I haven't tried out so far, but a few pointers

to apply them later on

I don't know which is the better option, probably the former one, as it seems to allow choosing the snapshot on an isolate-by-isolate basis.

serialization generally

I don't know of any way to serialize, cache, clone or whatever an isolate. The only thing I know of are said snapshots.

stesie commented 8 years ago

Hmmm, having given it another try it indeed is V8Js' fault...

If I modify v8js_v8_init to call

v8::V8::InitializeExternalStartupData("/home/stesie/Projekte/v8/out/native/");

before initializing the platform as well as the library, everything works fine (with V8 version 5.0) :smiley_cat:

snapshot quick start

create file doublify.js

function doublify(x) {
  return 2 * x;
}

... and create a snapshot with it embedded:

stesie@hahnschaaf:~/Projekte/v8/out/native$ ./mksnapshot --startup_blob=custom_snapshot.bin doublify.js
Embedding extra script: doublify.js

Then modify said v8js_v8_init function of V8Js:

    v8::V8::InitializeExternalStartupData(
        "/home/stesie/Projekte/v8/out/native/natives_blob.bin",
        "/home/stesie/Projekte/v8/out/native/custom_snapshot.bin"
    );  

make ...

stesie@hahnschaaf:~/Projekte/v8js$ php  -n -d extension_dir=./modules -d extension=v8js.so -d extension=readline.so -a
Interactive mode enabled

php > $v8 = new V8Js();
php > $v8->executeString('print(doublify(23)); ');
46

... and there it is :-)

virgofx commented 8 years ago

Here is timing information:

Construct: 1.8561 ms
Execute Library Code: 53.4339 ms
Execute Component #1: 8.0600 ms
Execute Component #2: 1.0581 ms
Execute Component #3: 0.3939 ms
Execute Component #4: 0.4709 ms
// $appScript = React.js + ReactServer.js + App Components + window/console globals

$t = microtime(true);
$v8js = new V8Js();
echo ('Construct: ' . number_format((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;

$t = microtime(true);
$v8js->executeString($appScript);
echo ('Execute Library Code: ' . number_format((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;

$t = microtime(true);
$v8js->executeString('ReactDOMServer.renderToString(React.createElement(Table,' . $jsonEncodedProps . '))');
echo ('Execute Component #1: ' . number_format((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;

$t = microtime(true);
$v8js->executeString('ReactDOMServer.renderToString(React.createElement(Table,' . $jsonEncodedProps2 . '))');
echo ('Execute Component #2: ' . number_format((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;

$t = microtime(true);
$v8js->executeString('ReactDOMServer.renderToString(React.createElement(Table,' . $jsonEncodedProps3 . '))');
echo ('Execute Component #3: ' .number_format ((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;

$t = microtime(true);
$v8js->executeString('ReactDOMServer.renderToString(React.createElement(Table,' . $jsonEncodedProps4 . '))');
echo ('Execute Component #4: ' .number_format ((microtime(true) - $t) * 1000, 4) . ' ms') . PHP_EOL;
virgofx commented 8 years ago

@stesie I'm going to try to build with a snapshot of the React library, if possible, to see if I can reduce that 60ms by compiling the extension manually with your suggestions, and include it in my timing above. If the heap snapshot reduces it significantly, would it be possible to update the V8JS __construct() API to include a snapshot (or add another method)?

stesie commented 8 years ago

Sure we can add some stuff to allow for custom snapshots. Before doing so I'd like to further test whether snapshots can be applied on an isolate-by-isolate basis and then provide the API accordingly.

I'd even consider to generate snapshots with V8Js, ... so you can create these, cache the result and re-use as needed. Even so it might be wise to support "global" custom snapshots as well so you need not pass the full snapshot on every request ...

Already looking forward to the snapshot-based timings ... exciting ... :)

virgofx commented 8 years ago

Sweet. Yes, generating snapshots via V8Js would be even easier (less work for the deployment/CI build process). And global snapshots would also be awesome: pass in the snapshot and register it extension-wise (as mentioned earlier). Will keep you posted, stay tuned.

virgofx commented 8 years ago

Unable to compare with V8-v4.4.9.1, as the build fails with: /vagrant/v8js/v8js_v8.cc:68:5: error: 'InitializeExternalStartupData' is not a member of 'v8::V8'. Will try the latest version (similar to you) to see if this was introduced in a later version.

stesie commented 8 years ago

so I was curious enough now to give it a try on my own ...

I've picked https://github.com/reactjs/react-php-v8js/tree/master/example and did some performance measurements, see this chart:

chart

I've tried with two different V8 versions with snapshots disabled and the 5.0.104 from yesterday with snapshots enabled. Once just with the "out of the box" snapshot and one with react included (excluding app components; which is just a table component in the example's case)

Stuff to learn from that

//var GLOBAL_MOUNT_POINT_MAX = Math.pow(2, 53);
var counter = 4906291055034368; // hard-coded random number FTW

var ServerReactRootIndex = {
  createReactRootIndex: function () {
    return counter ++;
    //return Math.ceil(Math.random() * GLOBAL_MOUNT_POINT_MAX);
  }
};

So it looks really promising :-)
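As a standalone illustration of the counter-based replacement shown above (a sketch only; React's surrounding code is omitted and the ids array is made up for the demonstration), the counter yields distinct, predictable ids without calling Math.random():

```javascript
// Starting value copied from the snippet above; it is below 2^53,
// so incrementing it stays exact in a JavaScript number.
let counter = 4906291055034368;

const ServerReactRootIndex = {
  // Return the current value, then advance the counter.
  createReactRootIndex() {
    return counter++;
  },
};

// Draw a few ids and check that none repeat.
const ids = [
  ServerReactRootIndex.createReactRootIndex(),
  ServerReactRootIndex.createReactRootIndex(),
  ServerReactRootIndex.createReactRootIndex(),
];
const allUnique = new Set(ids).size === ids.length;

console.log(ids, allUnique);
```

Each id is exactly one greater than the previous one, so uniqueness holds within a single instance. Whether two instances restored from the same snapshot start from the same counter value is not something this sketch checks.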

stesie commented 8 years ago

http://v8project.blogspot.de/2015/09/custom-startup-snapshots.html

virgofx commented 8 years ago

The custom snapshots results look very amazing. Are the timings you provided in milliseconds as well? I'm still trying to get the snapshot pull to compile locally, once I do I'll compare as well against my baseline snapshot test above.

stesie commented 8 years ago

yeah right, those are milliseconds too. The values themselves are averages over 100 samples each.
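Averaging over many samples, as described above, can be sketched with a small helper. This is a generic illustration assuming Node.js; averageMs and the JSON workload are hypothetical stand-ins, not the benchmark code actually used for the chart.

```javascript
const { performance } = require('perf_hooks');

// Run `fn` `samples` times and return the mean duration in milliseconds.
function averageMs(fn, samples = 100) {
  let total = 0;
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    fn();
    total += performance.now() - start;
  }
  return total / samples;
}

// Hypothetical workload standing in for "construct + render".
let calls = 0;
const mean = averageMs(() => {
  calls += 1;
  JSON.stringify({ rows: Array.from({ length: 100 }, (_, i) => i) });
});

console.log(`mean over ${calls} samples: ${mean.toFixed(4)} ms`);
```

A mean over 100 runs smooths out one-off costs, so it is worth reporting the first run separately when startup time is the quantity of interest.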

virgofx commented 8 years ago

TIMING RESULTS

4.4.9.1 - With Snapshot

Using default snapshot and loading React library + application via executeString()

Construct: 1.8561 ms
Execute Library Code: 53.4339 ms
Execute Component #1: 8.0600 ms
Execute Component #2: 1.0581 ms
Execute Component #3: 0.3939 ms
Execute Component #4: 0.4709 ms

Total: ~ 65.2 ms

5.0.104 - With Snapshot, V8JS-PR #207

Using mksnapshot with React (Modified - Removal of Math.randoms()), ReactServer, + Application

Construct: 5.1351 ms
// Library Code + App code [Unminified] inside snapshot ^
Execute Component #1: 13.8218 ms
Execute Component #2: 1.5359 ms
Execute Component #3: 1.2491 ms
Execute Component #4: 1.7309 ms

Total: ~ 23.35 ms

Construct: 6.5620 ms
// Library Code + App code [Minified] inside snapshot ^
Execute Component #1: 9.7649 ms
Execute Component #2: 0.7520 ms
Execute Component #3: 0.5841 ms
Execute Component #4: 0.5529 ms

Total: ~ 18.03 ms

Woot Woot ✔

virgofx commented 8 years ago

@stesie Thanks for the fabulous work integrating snapshot support and removing the 4.4.x snapshot dependency! Using V8 with V8JS is now highly optimized for large library/applications :+1:

stesie commented 8 years ago

you're welcome! And thanks to you, @virgofx, as well, for debating, testing and pushing me forward :) I'll release 0.5.0 after supper and upload it to PECL.

Would you like to set the phpv8-stubs stuff up? I'd then create a repo under the phpv8 org and grant you member rights on it.

stesie commented 8 years ago
virgofx commented 8 years ago

Yes, I can handle that, feel free to create repo and grant me rights. I'll do everything up to current and once you tag 5.0 , then I'll update stubs accordingly.

stesie commented 8 years ago

@virgofx just created the repo and granted you the rights (don't know whether Github sends mail notifications, probably yes ...)

aftabnaveed commented 6 years ago

Although this issue is old and closed, I just wanted to add my two cents here. With nginx -> spawn FPM/handle request/return response/close, V8Js is re-initialized with each request. But thanks to ReactPHP and PHP-PM https://github.com/php-pm/php-pm it is now possible to keep the application, and therefore V8JS, in memory for subsequent requests. This, in my opinion, will eliminate the V8JS initialization altogether. I have yet to try this, but I am optimistic about the outcome. @stesie do you think it would be a better idea to keep V8JS in memory for a long period of time? I am hoping there are no memory leaks.

virgofx commented 6 years ago

This approach is similar to making the extension persistent within FPM -- which can be done at the extension level without requiring any other php-pm ... making it faster ... which would be the way to do it. Similar to how the new mongo PHP extension keeps connections persistent.

stesie commented 6 years ago

@aftabnaveed if you use php-v8js on php-fpm it does not reinitialize V8 on each and every request. What is indeed created on every request (and on any subsequent call to new V8Js, actually) is a so-called isolate. This is the de facto sandbox you get with JavaScript, and sharing it from request to request wouldn't make much sense (as otherwise subsequent requests might see data that previous executions have left in the global JavaScript environment).
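The leakage risk described above can be illustrated with a small sketch. It assumes Node.js and its built-in vm module, which isolates with contexts rather than full V8 isolates, so it only approximates what V8Js does; the two script strings are made up for the demonstration.

```javascript
const vm = require('vm');

// "Request 1" leaves something behind in the global environment.
const leaveData = 'globalThis.leftover = "data from request 1";';
// "Request 2" checks whether anything was left behind.
const lookForData = 'typeof leftover === "undefined" ? "clean" : leftover';

// Shared sandbox: the second request sees what the first one left.
const shared = vm.createContext({});
vm.runInContext(leaveData, shared);
const sharedResult = vm.runInContext(lookForData, shared);

// Fresh sandbox per request: nothing carries over.
vm.runInContext(leaveData, vm.createContext({}));
const freshResult = vm.runInContext(lookForData, vm.createContext({}));

console.log(sharedResult, freshResult); // prints: data from request 1 clean
```

Creating a fresh sandbox per request costs startup time but avoids this kind of cross-request visibility, which is the trade-off the snapshot approach in this thread tries to soften.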