nodejs / roadmap

This repository and working group has been retired.

Should Node.js be VM neutral in the future? #54

Closed mikeal closed 2 years ago

mikeal commented 8 years ago

First and foremost, a bit of a reality check: The "Node.js Platform" is already available on a variety of VMs other than the V8 runtime we ship with:

Because of Node.js' massive ecosystem of packages, educational materials, and mind-share we should expect that in the future this will continue. Part of Node.js going everywhere is that it may need to be on other VMs in environments V8 can't go. There's not much that we can do to prevent this.

So the question becomes: Should Node.js Core move towards being VM neutral and supporting more VMs in the main project?

Some of the advantages would be:

There's a long discussion about how to do this. Without guarantees from all the target VM vendors that they will support this neutral API it could fall on us to make that work. Historically V8 has made drastic API changes that were not supportable through the API nan had implemented.

There's also an open question about how to structure the tree and build and if we should continue sticking V8 and every supported VM in a vendor directory or pulling it in during build time.

Anyway, let's have the discussion :)

@nodejs/ctc

obastemur commented 8 years ago

@aruneshchandra

From ChakraCore’s perspective, we are willing to support the development of a neutral API and are ready to have members of our team participate in the implementation efforts.

For example, ChakraCore has a dual execution pipeline with a mature standalone interpreter, which is easier to port to systems with different instruction sets than a JIT-only VM.

I hope things work well.

I guess, multiple teams are trying to achieve similar goals.

If this succeeds, we (JXcore) could stop updating a whole framework plus multiple VMs in order to run node 4/5/6?/7?... apps on other platforms. So the next version of JXcore could be mobile tooling around the latest node.js. Even better, if node could inherit some of the internal stuff we do, no tooling would be needed at all.

I hope naysayers will take the real potential into consideration.

ugate commented 8 years ago

@trevnorris @mikeal Any way that node could utilize what's being done by WebAssembly for a binding point?

rumkin commented 8 years ago

Let's decide:

  1. How many engines should be supported with the ABI? One, two, more?
  2. Who will support ABI's underlying level: nodejs community or vendor?
  3. Which engines are preferable and why?
  4. How long could we work without ABI?

My answers:

  1. If we have more than one, then we are talking about an Abstract JavaScript Engine. Let's make JavaScript engines embeddable and maintainable. There are a lot of products that need this too, like electron, nw or nginx, and they will join us.
  2. It's hard for me to imagine that we will support all engines.
  3. I think we should specify node.js goals for the near future to answer this question.
  4. I don't know, but I think more than half a year, so we can make a deliberate decision. There is no reason for haste. All we need now is to decide whether to do it or not, and when we are planning to start.

trendzetter commented 8 years ago

Multiple VMs are not needed as far as I can tell; only Microsoft needs them, to extend its influence and save its failing attempts to displace Android. It seems to me Microsoft is throwing a lot of money at taking over node for its Windows universal apps. If we go along with it, the rest will suffer.

trevnorris commented 8 years ago

@ugate That pattern would basically match my proposal of making a low-level JS API, where each VM maintains its own code connected to that API. Other module authors could then also use WebAssembly as a native binding. Though realistically I'm not sure any of that will happen any time soon.

ugate commented 8 years ago

@trevnorris Okay, thanks for the clarification. It sounds like a better alternative than native API mangling. It's a very interesting proposal. I'm interested in seeing more details.

orangemocha commented 8 years ago

I think it's great to see the discussion around the possible approaches to achieve VM neutrality. But given that all of these approaches entail quite massive undertakings, and will require a lot of experimentation and prototyping, I doubt that we will be able to make enough progress on an actual solution in the context of this thread.

May I suggest that we frame the discussion in this issue into: 1) Resolving whether multiple VMs in Node is something that we see value in. 2) What constraints (e.g. pitfalls to avoid) need to be met for a given solution to be acceptable.

.. and then turn that into a mandate for the API WG to further develop a solution / proposal that meets the constraints?

orangemocha commented 8 years ago

One more potential advantage of this effort, as a side effect of a normalized native API, would be to be able to distribute native modules in precompiled form, which in turn could reduce or eliminate a host of node-gyp support issues.

Where is V8 not supported?

iOS comes to mind.

ghost commented 8 years ago

iOS comes to mind

And Windows ARM, where we, the "third party" folks, have a weird restriction: we can either do the codegen in our code or execute it, but not both! For details, see: https://bugs.chromium.org/p/v8/issues/detail?id=2427 and this is still valid for Windows 10. It is the same reason why we can't have a JVM on Windows ARM to run Java code without precompiling it to native code. Apple has restrictions similar to Microsoft's. Basically the latter took huge inspiration from the former, regardless of the good or bad choices the former had made in the consumer's opinion (if that matters at all). One can argue that this is for the "security" of the (eco?)system, but it is equally good for taking control over what consumers can or can't do with their purchased merchandise. Android, however, does right by users in this area.

luanmuniz commented 8 years ago

@trevnorris I'm with you, this low-level abstraction API is a good idea. I'm not sure I can figure out how this API will work, but maybe it can help all the VMs get in sync on features and compatibility.

I'm not sure what the status of the VMs is today, but I would like to see a VM that has the same political organization as Node.js, something that is backed by a community more than a company, and to use that in Node.js. I know this is far from feasible at this time, but maybe node.js having its own VM is something to think about.

LPGhatguy commented 8 years ago

Would it be desirable to build a new JS VM with the common JS VM API to help steer its development? It would be a massive undertaking, and probably an implausible one, but it would provide a reference implementation and make sure the correct design choices are made.

Zayelion commented 8 years ago

@LPGhatguy that seems appropriate but extreme. Each VM has a company attached to it; what you are asking would be like herding cats. The foundation could direct such an effort but I don't see it working out without paid developers. I feel you are asking that we purposely make code inferior to V8 or Chakra and/or put the open source community up against corporations. V8 is already sort of the open option, so this feels smart but weird to me.

I'm saying it's a good idea but it doesn't feel doable or maintainable, due to social reasons alone.

wwahammy commented 8 years ago

Coming to this discussion after the fact I skimmed the comments but I thought I'd add my two cents. I'm a huge fan of extensibility in open source but the advantages here seem really minimal and the risks exceedingly high.

Much of the discussion seems to be related to problems integrating with V8, particularly over changes. Has there been any consideration of abandoning the concept of developing a common API and simply forking V8 into Node's custom JS engine? The needs of V8 and Node intersect in some places but in many they're widely different. So why not just abandon the connection? It has less risk to the community and guarantees that Node has full control over its own future.

bnoordhuis commented 8 years ago

Has there been any consideration of abandoning the concept of developing a common API and simply forking V8 into Node's custom JS engine?

It's been discussed but it's unrealistic. V8 is huge, moves fast and has a high barrier to entry. If you're not already a compiler developer, your ramp-up time is going to be measured in years.

kobalicek commented 8 years ago

Any progress on JS-engine neutrality?

I'm asking as I started developing a neutral API for myself, so I may share it when I have some results. But if anybody has already developed something functional, I would take a look at that first. Thanks.

Fishrock123 commented 8 years ago

@kobalicek see https://github.com/nodejs/vm :)

kobalicek commented 8 years ago

Thanks for the link, I overlooked it I guess.

However, I see no code, so I will continue with my approach.

Fishrock123 commented 8 years ago

@kobalicek Right, work hadn't started yet, we'll probably try to round up some VM implementor people soon to poke it to getting started.

If you wouldn't mind, could you post your progress? The idea was to start absolute minimal on the "shim", and probably base it off v8 (which has higher-level abstractions than say, Chakra).

By minimal, I mean only the following (at the start):

kobalicek commented 8 years ago

@Fishrock123 I have already implemented a bindings interface on top of V8 called JSNI (not public atm), but it uses preprocessor magic excessively. So now I'm trying a different approach by abstracting the API and defining a declarative approach on top of it. My very initial idea when starting JSNI was to avoid the amount of boilerplate that was necessary to use V8 (I didn't think of VM neutrality back then).

When the discussion about VM neutrality started, I started thinking about how to achieve something similar to JSNI, but without using macros at all. I call my new prototype NJS (NativeJS) and I can ping here when I get it working for my current project (which requires a bit more than the minimal matrix you posted).

mhdawson commented 8 years ago

@kobalicek there are plans coming out of the discussion last week from https://github.com/jasnell/vm-summit for @ianwjhalliday and @stefanmbu to get started on an API. There will be a summary of the overall meeting coming out soon and one of the next steps is to schedule an API WG (https://github.com/nodejs/api) meeting to present the current plan from the summit and to see if there are others who also have time to contribute to the effort. Sounds like we should somehow sync what you are doing with this effort.

kobalicek commented 8 years ago

@mhdawson I checked out the documents, thanks for pointing them out!

What I'm missing here is that there are a lot of questions, but there is no analysis of existing VMs. I don't even see which VMs would be the target. I'm writing this as I have studied SpiderMonkey, V8, and ChakraCore so far, and these engines really have different APIs and handle types. I think that whoever defines the initial neutral API proposal should be really aware of all the differences these engines have. Supporting simpler engines like duktape shouldn't be an issue as they have much simpler APIs and usually have only one type of handle.

My observations so far:

Basically only based on "my observations" I can answer some questions asked:

Well, sorry for long reply, just wanted to share some of my observations.

stefanmb commented 8 years ago

@mhdawson The API that @ianwjhalliday and I are planning is a module API. This is related to the VM agnostic problem but not the same thing, as established during https://github.com/jasnell/vm-summit, if anything it is closer to an evolution of Nan.

@kobalicek The detailed list of concerns is very useful, thanks. There have been several other attempts at shimming and it is likely worthwhile examining them as well. Here are the ones I am aware of: https://github.com/jxcore/jxcore/blob/master/doc/native/Embedding_API_Details.md https://github.com/martine/v8c/blob/v8c/include/v8c.h https://github.com/tjfontaine/node-addon-layer/blob/master/include/shim.h

Fishrock123 commented 8 years ago

See also https://github.com/nodejs/vm/issues/1

kobalicek commented 8 years ago

I think another question that should be answered is about performance. If the Neutral API wraps the underlying engine completely (aka jxcore approach or v8c approach) then there will be some overhead.

For example here https://github.com/tjfontaine/node-addon-layer/blob/master/include/shim.h I don't really like the dynamic memory allocations that happen inside the implementation. I really think the wrapper should be thin and shouldn't need to allocate additional memory. VM architects have already taken care of it.
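To make the "thin wrapper" idea concrete, here is a minimal sketch of a handle type that adds no storage and performs no allocation. `EngineValue` is a hypothetical stand-in for whatever value type an engine really exposes; it is not taken from V8, ChakraCore, or the shim linked above.

```cpp
#include <type_traits>

// Hypothetical engine-owned value. A real engine defines and manages its own
// representation; this struct exists only so the sketch compiles.
struct EngineValue {
  double number;
};

// Thin handle: holds nothing but the engine's pointer, so creating or copying
// one never touches the heap.
class Value {
 public:
  explicit Value(EngineValue* raw) noexcept : raw_(raw) {}
  double ToNumber() const noexcept { return raw_->number; }
  EngineValue* raw() const noexcept { return raw_; }

 private:
  EngineValue* raw_;
};

// The wrapper is exactly one pointer wide and trivially copyable, so passing
// it by value costs the same as passing the raw pointer.
static_assert(sizeof(Value) == sizeof(void*), "wrapper adds no storage");
static_assert(std::is_trivially_copyable<Value>::value,
              "wrapper is a plain value type");

// Example operation written purely against the wrapper.
inline double AddNumbers(Value a, Value b) noexcept {
  return a.ToNumber() + b.ToNumber();
}
```

The two `static_assert`s are the point of the sketch: they let a shim author check at compile time that the abstraction stays as cheap as the raw engine handle.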

kzc commented 8 years ago

My 2 cents regarding the engine neutral API - macros and C++ templates do not lend themselves to stable long term ABIs. A goal of this project should be to be able to use the same node native module shared library against any engine (v8, chakra or other) on the same platform without recompilation. Engine state should not leak out into the API. C linkage provides the most stable ABI and is compiler agnostic on the same platform. But the Qt/KDE projects have also proven that this can be done with a restricted subset of C++ for the same compiler.

https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B
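As a rough illustration of the C-linkage approach described above, the sketch below exposes only opaque pointers and status codes, so no engine state leaks into the module-facing API. All `vmn_*` names are hypothetical and invented for this example; they are not part of any Node.js or engine API, and the "engine side" is a toy.

```cpp
#include <deque>

// ---- Module-facing API: C linkage, opaque types only ----
extern "C" {
typedef struct vmn_env vmn_env;      // opaque: layout hidden from modules
typedef struct vmn_value vmn_value;  // opaque: layout hidden from modules
typedef enum { VMN_OK = 0, VMN_INVALID_ARG = 1, VMN_FAILURE = 2 } vmn_status;

vmn_env* vmn_env_create(void);
void vmn_env_destroy(vmn_env* env);
vmn_status vmn_create_number(vmn_env* env, double in, vmn_value** out);
vmn_status vmn_get_number(const vmn_value* value, double* out);
}

// ---- Engine side (toy): free to change without breaking callers ----
struct vmn_value {
  double number;
};
struct vmn_env {
  // std::deque keeps element addresses stable across push_back.
  std::deque<vmn_value> values;
};

extern "C" vmn_env* vmn_env_create(void) {
  // Exceptions must not cross the C boundary, so failures become nullptr.
  try {
    return new vmn_env();
  } catch (...) {
    return nullptr;
  }
}

extern "C" void vmn_env_destroy(vmn_env* env) { delete env; }

extern "C" vmn_status vmn_create_number(vmn_env* env, double in,
                                        vmn_value** out) {
  if (env == nullptr || out == nullptr) return VMN_INVALID_ARG;
  try {
    env->values.push_back(vmn_value{in});
    *out = &env->values.back();
    return VMN_OK;
  } catch (...) {
    return VMN_FAILURE;
  }
}

extern "C" vmn_status vmn_get_number(const vmn_value* value, double* out) {
  if (value == nullptr || out == nullptr) return VMN_INVALID_ARG;
  *out = value->number;
  return VMN_OK;
}
```

Because callers only ever hold `vmn_env*` and `vmn_value*`, the structs behind them can be replaced (for example by a different engine's representation) without recompiling the module, as long as the exported function signatures stay the same.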

kobalicek commented 8 years ago

@kzc Sorry, but I think that some C++ magic would actually help to create something that can be stable and useful in long-term. Interacting with VM's that use concurrent garbage collection isn't easy and C++ actually really helps here. Even SpiderMonkey moved from C to C++, because it makes the interaction with the VM easier. If I take into consideration that V8 and SM already provide C++ API then proposing something C-only doesn't make much sense here. That's my 2 cents :)

kzc commented 8 years ago

@kobalicek My point was having a stable ABI whether it is C or C++. One that remains backwards compatible without the need for native module recompilation for new node releases - even with new engines supported. Compile-time solutions are not as useful or convenient for users.

mhdawson commented 8 years ago

Sorry for causing some confusion. I should have made it clear the API work I mentioned is focused initially on modules. It was discussed that it might become part of the solution for the vm layer integration but that was to be seen later on.

I do believe that for native modules they need to be able to be used without recompilation with different Node binaries. Similarly, although I acknowledge there will be challenges, I also think we should target being able to use different engines with the same Node binary without recompilation. This has some desirable characteristics and will help ensure that engine internals don't leak out.

kobalicek commented 8 years ago

@mhdawson I just wonder what is node gonna expose if you plan to make it ABI compatible?

bobmcwhirter commented 8 years ago

fwiw, the progress on Nodyn stalled because of the lack of VM neutralness.

Should this become a reality, Nodyn may certainly be re-invested in.

lance commented 8 years ago

@ariya et al. I am the creator of Nodyn, and just stumbled on this issue recently. There were a lot of reasons I (and Red Hat) stopped development on Nodyn. As @bobmcwhirter noted, a lack of VM neutrality was one. Another was the fact that we were just winging it - working with no documentation other than the Node.js source code.

I've now seen https://github.com/nodejs/vm and will follow along.

@trevnorris

Another option that I've investigated and found promising is creating a lower level JS API that the existing API can sit on top of. Then the binding point for the VM is on the JS layer, which reduces the native API problem to the public API.

+1 👍

DemiMarie commented 8 years ago

One approach is to ditch native modules entirely and replace them with an FFI. However, there is a catch: an FFI would need to be implemented in each VM and integrated with the VM's JIT and GC in order to have good performance. Furthermore, an FFI is almost useless for browsers (lack of isolation in JS ensures that this would lead to security problems).

The big advantages of a C API (vs. C++) are:

mhdawson commented 8 years ago

@DemiMarie you may want to check out the latest API working group meeting. I've not written up the notes but the recording is available and link to raw notes is in the meeting issue as well:

https://github.com/nodejs/api/issues/22

I'm suggesting this as there is work on the stable ABI (well just for modules at this point). We are heading down the path of a C API with C++ sugar on top.

ianwjhalliday commented 8 years ago

@DemiMarie agreed. FFI does have its interesting merits, but I think mostly as a convenience to the end users, and I don't know if the effort is worth the benefit. @ofrobots is experimenting with this idea. I am interested to see what he comes up with.

jinderek commented 8 years ago

@kobalicek I think the overhead cannot be avoided for NOW. The VMs are already mature and different, so we have to make some sacrifices in the implementation.

kobalicek commented 8 years ago

@mhdawson So you wrap a C++ API in a C API and then create a C++ layer to provide sugar on top of the C API. To me this seems like a step backward, sorry.

@xuyv Talking about overhead without having the overhead measured is premature. If you wrap every interaction with V8 that was now inlined into a function call then the overhead can be even 5x/10x, which doesn't seem like a good idea, especially if you consider that node.js is a high-performance environment.
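One way to put numbers on this concern is a micro-benchmark that compares an inlined accessor with the same accessor reached through an opaque function pointer. The sketch below is only a starting point under stated assumptions: `Slot`, `ReadInline`, and `ReadOutOfLine` are hypothetical stand-ins rather than engine APIs, and the result will vary with compiler, flags, and workload, so it neither confirms nor refutes the 5x/10x figure.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an engine-internal value slot.
struct Slot {
  std::int64_t payload;
};

// Header-style accessor the compiler is free to inline.
inline std::int64_t ReadInline(const Slot& slot) { return slot.payload; }

// Out-of-line accessor, standing in for a function exported across a
// library boundary.
std::int64_t ReadOutOfLine(const Slot* slot) { return slot->payload; }

using ReadFn = std::int64_t (*)(const Slot*);

// Volatile pointer: forces a real indirect call on every iteration instead
// of letting the optimizer see through to ReadOutOfLine.
ReadFn volatile g_read = &ReadOutOfLine;

std::int64_t SumInline(const std::vector<Slot>& slots) {
  std::int64_t total = 0;
  for (const Slot& slot : slots) total += ReadInline(slot);
  return total;
}

std::int64_t SumOutOfLine(const std::vector<Slot>& slots) {
  std::int64_t total = 0;
  for (const Slot& slot : slots) total += g_read(&slot);
  return total;
}

// Wall-clock time of one call to fn, in milliseconds.
template <typename Fn>
double MillisecondsFor(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Builds test data with payloads 0, 1, ..., count - 1.
std::vector<Slot> MakeSlots(std::size_t count) {
  std::vector<Slot> slots(count);
  for (std::size_t i = 0; i < count; ++i) {
    slots[i].payload = static_cast<std::int64_t>(i);
  }
  return slots;
}
```

To use it, build with optimizations enabled, time `SumInline` and `SumOutOfLine` over the same large vector with `MillisecondsFor`, and repeat each measurement several times before drawing conclusions.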

BTW: @mhdawson - I think you are doing it in the opposite order: if you decide to change the API, you should first start with node internals and then propagate these changes to node modules, rather than exposing the new thing (which will be highly untested and unstable in the beginning) to all the modules that depend on V8 now while keeping the old API internally. Another reason is that the new API should really be able to do everything the old API does.

I just hope I will be able to use V8 in the future and keep my modules high-performance, others are fine to use the new API :)

mhdawson commented 8 years ago

@kobalicek the challenge is that it may just not be possible to maintain ABI stability while using C++. Ian found this document which covers some of the issues: http://www.oracle.com/technetwork/articles/servers-storage-dev/stablecplusplusabi-333927.html.

The net seems to be that the stable ABI would be in C, but C++ wrappers that are all inlined and only use the stable ABI are possible, because the C++ would be built into the module and the module could then continue to work provided the exports in C are ABI stable.
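The layering described here can be sketched as follows: a small set of C exports forms the stable boundary, and a header-only C++ class compiled into the module provides the sugar. Every `sapi_*` name and the `sugar::Value` class are hypothetical, made up for this illustration, and the toy implementation stands in for whatever Node would really export.

```cpp
#include <new>
#include <stdexcept>

// ---- Stable boundary: plain C exports (hypothetical names) ----
extern "C" {
typedef struct sapi_value sapi_value;
typedef enum { SAPI_OK = 0, SAPI_NOT_A_NUMBER = 1 } sapi_status;

sapi_value* sapi_make_number(double d);
sapi_value* sapi_make_undefined(void);
sapi_status sapi_get_number(const sapi_value* v, double* out);
void sapi_release(sapi_value* v);
}

// ---- Toy implementation, standing in for the runtime's side ----
struct sapi_value {
  bool is_number;
  double number;
};

extern "C" sapi_value* sapi_make_number(double d) {
  return new (std::nothrow) sapi_value{true, d};
}
extern "C" sapi_value* sapi_make_undefined(void) {
  return new (std::nothrow) sapi_value{false, 0.0};
}
extern "C" sapi_status sapi_get_number(const sapi_value* v, double* out) {
  if (v == nullptr || out == nullptr || !v->is_number) return SAPI_NOT_A_NUMBER;
  *out = v->number;
  return SAPI_OK;
}
extern "C" void sapi_release(sapi_value* v) { delete v; }

// ---- Header-only C++ sugar: compiled into the module itself ----
namespace sugar {

// RAII owner that only ever calls the C exports above, so its own layout
// and inlining never become part of the binary contract.
class Value {
 public:
  explicit Value(sapi_value* raw) : raw_(raw) {}
  ~Value() { sapi_release(raw_); }
  Value(const Value&) = delete;
  Value& operator=(const Value&) = delete;

  // Translates a C status code into a C++ exception inside the module.
  double AsNumber() const {
    double out = 0.0;
    if (sapi_get_number(raw_, &out) != SAPI_OK) {
      throw std::runtime_error("value is not a number");
    }
    return out;
  }

 private:
  sapi_value* raw_;
};

}  // namespace sugar
```

Since `sugar::Value` lives entirely in the module's own object code, changing or recompiling it does not affect the exported `sapi_*` symbols, which is what keeps an already-built module loadable.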

Discussion around whether to change the internal use/module use was extensive at the vm summit with people landing on starting with the modules first. The discussion was that the ABI stable API could address 90-95% of native modules, while those with specific requirements could continue to use the v8 APIs directly.

We are at the point of starting to assess the performance overhead in these issues (still early days though so take them with a grain of salt):

So that is definitely part of the analysis.

kobalicek commented 8 years ago

If we will be allowed to use V8 directly then I have no problem with that. I personally consider wrapping this in a C API (with a lot of external functions) a significant overhead that I just can't accept. I have already tried to make a C++ wrapper around V8 API in a way to hide everything, my attempt is available here: https://github.com/kobalicek/njs

The thing is, from my own experience, that wrapping VM APIs in a low-level way is much more complicated than creating higher level interface, because their low-level API is the thing that is the most different. For example compare how you wrap classes in V8 and in SpiderMonkey - both engines provide a completely different way of doing that thing.

The way I use NJS looks like this (sorry for the somewhat long code, showing only one simple class):

// C++ header (allows other modules to use that in other C++ code).
struct JSLinearGradient : public JSGradient {
  NJS_INHERIT_CLASS(JSLinearGradient, JSGradient, "LinearGradient")

  NJS_INLINE JSLinearGradient(double x0, double y0, double x1, double y1) NJS_NOEXCEPT
    : JSGradient(b2d::Gradient::kTypeLinear, x0, y0, x1, y1) {}
};

// C++ source (implementation).
NJS_BIND_CLASS(JSLinearGradient) {
  NJS_BIND_CONSTRUCTOR() {
    unsigned int argc = ctx.ArgumentsCount();

    double x0, y0, x1, y1;
    if (argc == 0) {
      x0 = y0 = x1 = y1 = 0.0;
    }
    else if (argc == 4) {
      NJS_CHECK(ctx.UnpackArgument(0, x0));
      NJS_CHECK(ctx.UnpackArgument(1, y0));
      NJS_CHECK(ctx.UnpackArgument(2, x1));
      NJS_CHECK(ctx.UnpackArgument(3, y1));
    }
    else {
      return ctx.InvalidArgumentsCount();
    }

    JSLinearGradient* self = new(std::nothrow) JSLinearGradient(x0, y0, x1, y1);
    ctx.Wrap(ctx.This(), self);
    return ctx.Return(ctx.This());
  }

  NJS_BIND_GET(x0) {
    return ctx.Return(self->_obj.getValue(b2d::Gradient::kScalarIdLinearX0));
  }

  NJS_BIND_SET(x0) {
    double x0;
    NJS_CHECK(ctx.UnpackValue(x0));
    self->_obj.setValue(b2d::Gradient::kScalarIdLinearX0, x0);
    return njs::kResultOk;
  }

  NJS_BIND_GET(y0) {
    return ctx.Return(self->_obj.getValue(b2d::Gradient::kScalarIdLinearY0));
  }

  NJS_BIND_SET(y0) {
    double y0;
    NJS_CHECK(ctx.UnpackValue(y0));
    self->_obj.setValue(b2d::Gradient::kScalarIdLinearY0, y0);
    return njs::kResultOk;
  }

  NJS_BIND_GET(x1) {
    return ctx.Return(self->_obj.getValue(b2d::Gradient::kScalarIdLinearX1));
  }

  NJS_BIND_SET(x1) {
    double x1;
    NJS_CHECK(ctx.UnpackValue(x1));
    self->_obj.setValue(b2d::Gradient::kScalarIdLinearX1, x1);
    return njs::kResultOk;
  }

  NJS_BIND_GET(y1) {
    return ctx.Return(self->_obj.getValue(b2d::Gradient::kScalarIdLinearY1));
  }

  NJS_BIND_SET(y1) {
    double y1;
    NJS_CHECK(ctx.UnpackValue(y1));
    self->_obj.setValue(b2d::Gradient::kScalarIdLinearY1, y1);
    return njs::kResultOk;
  }
};

The code itself has zero overhead as it doesn't have to call external functions, everything possible is inlined, everything that would expand is marked NJS_NOINLINE to expand only once in a binary. Maybe I can discuss my way with some nodejs dev somewhere? Not sure if this is the right thread.

DemiMarie commented 8 years ago

I think that the overhead of a stable ABI may just be too large (one good reason to have a JIT-integrated FFI). Indeed, most languages have abandoned ABIs and instead require transitive recompilation of reverse dependencies.

We need to do benchmarks of JS->native and native->JS calls and other API use.

If one is to have a stable ABI, it needs to still be fast enough that people don't need to drop down to the VM native API, defeating the purpose. One option is for the C wrapper to be built with the VM and compiled using link-time optimization (LTO).

bnoordhuis commented 8 years ago

The code itself has zero overhead as it doesn't have to call external functions, everything possible is inlined, everything that would expand is marked NJS_NOINLINE to expand only once in a binary.

That sounds like the approach that nan takes but that is not good enough for what we're discussing here because it only maintains source compatibility, not binary compatibility (API vs. ABI.)
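The API vs. ABI distinction can be shown with a small sketch. The two `StringHeader` layouts below are hypothetical, invented to stand in for two versions of an engine's internal headers; the same source compiles against both, but the compiled field offset differs, which is why a binary built against one cannot safely run against the other.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout exposed by "version 1" of an engine's headers.
namespace engine_v1 {
struct StringHeader {
  std::uint32_t length;
};
}  // namespace engine_v1

// Hypothetical "version 2": a field was added in front of length.
namespace engine_v2 {
struct StringHeader {
  std::uint64_t hash;
  std::uint32_t length;
};
}  // namespace engine_v2

// Source (API) compatibility: this code compiles unchanged against either
// version because it only names the member.
template <typename Header>
std::uint32_t LengthOf(const Header& header) {
  return header.length;
}

// Binary (ABI) incompatibility: the offset baked into compiled code differs,
// so an add-on compiled against v1 would read the wrong bytes under v2.
constexpr std::size_t kLengthOffsetV1 = offsetof(engine_v1::StringHeader, length);
constexpr std::size_t kLengthOffsetV2 = offsetof(engine_v2::StringHeader, length);
static_assert(kLengthOffsetV1 != kLengthOffsetV2,
              "same source, different binary layout");
```

A source-level layer hides this difference only at compile time; a stable ABI would have to keep such layouts out of the module's compiled code altogether.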

kobalicek commented 8 years ago

It's not exactly what nan does: nan builds on top of V8, while NJS defines an interface which is then provided by a V8-integration layer. But yes, it's not an ABI, it's a source-level compatibility layer.

DemiMarie commented 8 years ago

A rather wild option:

What about using libclang and LLVM to JIT compile C++ add-ons at run-time?

ianwjhalliday commented 8 years ago

Hmm, no. That wouldn't help with the problem of API/ABI breakage with different versions of node/v8. I believe that would be more or less equivalent to the status quo, albeit deferring compilation to runtime, which would impact performance. It may also have issues on Windows; I'm not sure how good libclang's MSVC compatibility is.

DemiMarie commented 8 years ago

It would solve the ABI problem by requiring that the source always be available. The compiled code could (and should) be put in a persistent cache, so the performance penalty could be minimized.

But this is just working around the problem. I once believed that a standard FFI integrated into each engine's JIT is the solution. But it is a LOT of work that has basically no use in browsers for obvious security reasons, since JS lacks adequate encapsulation to allow writing fully safe wrappers, nor does it have in-realm access control. An FFI works great for languages like Rust, OCaml, D, C#, and Haskell that enforce encapsulation, allowing wrappers to check arguments. Doing the same in JS is nearly impossible, since JS has too much reflection.

Another issue is that any FFI will allow crashing the VM or – worse – C-style memory unsafety security vulnerabilities.

If we do use an FFI exclusively, there will need to be ways for library authors to provide safe wrappers with little performance penalty.
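As a minimal illustration of what a "safe wrapper" means here, the sketch below pairs an unchecked foreign-style function with a wrapper that validates its arguments first. Both `raw_read_at` and `SafeReadAt` are hypothetical names for this example. It is written in C++, where the wrapper's checks cannot be bypassed by reflection; as noted above, giving the same guarantee in JS is the hard part.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a raw foreign function: no bounds checking, so a bad index
// is undefined behavior (a crash or a memory-safety bug).
extern "C" int raw_read_at(const int* data, std::size_t index) {
  return data[index];
}

// Safe wrapper: the only extra cost is one comparison before the raw call.
// Returns false instead of calling through when the arguments are invalid.
bool SafeReadAt(const std::vector<int>& data, std::size_t index, int* out) {
  if (out == nullptr || index >= data.size()) return false;
  *out = raw_read_at(data.data(), index);
  return true;
}
```

The design choice is to keep validation in the wrapper and the raw call unchecked, so code that has already proven its arguments valid does not pay for the check twice.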

On Sep 13, 2016 15:19, "Ian Halliday" notifications@github.com wrote:

Hmm, no. That wouldn't help with the problem of API/ABI breakage with different versions of node/v8. I believe that would be more or less equivalent to the status quo, albeit deferring compilation to runtime, which would impact performance. It may also have issues on Windows; I'm not sure how good libclang's MSVC compatibility is.


valera-rozuvan commented 7 years ago

A year has passed without a comment. Has this been discussed elsewhere? Will there be a common ABI for a VM to integrate itself with Node?

williamkapke commented 7 years ago

Has this been discussed elsewhere?

Yup! check out: https://github.com/nodejs/vm

Trott commented 2 years ago

Closing all issues in this archived repository.