phpv8 / v8js

V8 Javascript Engine for PHP — This PHP extension embeds the Google V8 Javascript Engine
http://pecl.php.net/package/v8js
MIT License
1.83k stars 200 forks source link

Modules caching not working properly #140

Closed pinepain closed 9 years ago

pinepain commented 9 years ago

In general:

  1. require() function doesn't use cached loaded modules.
  2. If we fix that, loaded modules may be changed after they loaded and required (incl. it cached value), so caching become useless.

TL;DR; Original problem described and explained in SO question (with perfect answer) - Using char* as a key in std::map.

Current definition is std::map<char *, v8js_persistent_obj_t> modules_loaded; at v8js_class.h#L46,

so the following check if (c->modules_loaded.count(normalised_module_id) > 0) { at v8js_methods.cc#L244 will always fail while normalised_module_id points to dynamically allocated memory.

All above, surprisingly, gives us a valid clean object each time we call require() js method.

When we pass comparator

struct cmp_str
{
  bool operator()(char const *a, char const *b) const
  {
    return strcmp(a, b) < 0;
  }
};

to make loaded modules looks like std::map<char *, v8js_persistent_obj_t, cmp_str> modules_loaded;,

then require() function return previously stored object, BUT following test will fail:

--TEST--
Test V8 require() js function 
--SKIPIF--
<?php
if(!function_exists('json_encode')) {
    die('SKIP');
}
require_once(dirname(__FILE__) . '/skipif.inc');
?>
--FILE--
<?php

$code['test.js'] = 'var out = {"foo" : "unchanged"}; exports.test = out; exports';

$JS = '
var test = require("test.js").test;

print(test.foo, "\n");
test.foo = "changed";
print(test.foo, "\n");

var test2 = require("test.js").test;

print(test2.foo, "\n");
';

$v8 = new V8Js();
$v8->setModuleLoader(function ($module) use (&$code) {
    if (isset($code[$module])) {
        return $code[$module];
    }

    return 'print(' . json_encode($module . PHP_EOL) . ');';
});

$v8->executeString($JS, 'module.js');
?>
--EXPECT--
unchanged
changed
unchanged

It will return

unchanged
changed
changed

which may be interpreted (and it will) by users who face with it as pure magic.

Moreover, freeing map's key here

    c->modules_loaded[normalised_module_id].Reset(isolate, newobj);
    info.GetReturnValue().Set(newobj);
    efree(normalised_module_id);

after using it is wrong while comparator will fail trying to read key string stored under char *.

same here,

    if (c->modules_loaded.count(normalised_module_id) > 0) {
        efree(normalised_module_id);
        efree(normalised_path);

        v8::Persistent<v8::Object> newobj;
        newobj.Reset(isolate, c->modules_loaded[normalised_module_id]);

freeing value and later using it to get value from map.

The question in general is about what to do with require() js method which clearly works, but does slightly different things from what it originally declares. Maybe using v8::Objec::Clone() will fits caching needs better?

stesie commented 9 years ago

sorry for not answering earlier, but anyways thanks @pinepain for raising the debate

To be honest I've never really used V8Js' module stuff much before, apart from some testing. But already noticed some of the memory usage issues before; apart from the ones you mentioned, there is some reliance on PATH_MAX without bounds checking, etc.

So to give it a go, I'd strive for Node.js compliance, that said ...


==> foo.js <==
var test = require("./bar.js").test;

console.log(test.foo);
test.foo = "changed";
console.log(test.foo);

var test2 = require("./bar.js").test;

console.log(test2.foo);

==> bar.js <==
var out = {"foo" : "unchanged"};
exports.test = out;

result

unchanged
changed
changed

So at least they are fine with what you consider "pure magic to the user" :-)

Regarding your proposal to clone the objects returned by require, ... I don't think that solves all problems. Suppose you have a module that has some private variables within it and provides a public API by means of exports, if you just clone the exports object you gain nothing, as you don't clone the private internal variables.

That is I think you'd have to clone the whole context, which isn't supported by V8 as far as I can see (at least v8::Context doesn't have a clone member).

So if we strive for backwards compatibility it'd probably be easiest to allow to disable module caching (what actually is what we're doing now) -- so I'm uncertain if that's feature creep or a necessity.

pinepain commented 9 years ago

The idea is that code seems to be designed to cache modules but it doesn't. It doesn't look like it was done intentionally, so question is more about what should be desired behavior. For me Node.js way to cache modules based on their resolved filename looks good (ok, that "pure magic" was just for me after digging code related to caching). As to cloning, yes, there are a lot of hidden places to fail, so probably fixing caching would be minimal and least painful change to do, sure, if it's OK to break BC.