node-forward / discussions

Soliciting ideas and feedback for community driven collaborative projects that help Node.

Why not a real alternative to Node? #9

Open mvalente opened 9 years ago

mvalente commented 9 years ago

Node is great and V8 is great.

But why not take Mozilla's Spider/Eon/Odin/Monkey and create a real alternative to Node? Namely, something that would be compatible with Node's async/callback style but would also offer a sync-only API, to better welcome people coming from PHP, Python, etc.?

jamlfy commented 9 years ago

@Fishrock123 Saying that V8 is faster is not true, at least not compared with the latest versions of Mozilla's engine. That is a myth!

I share @bnoordhuis's position: V8 was chosen at the beginning for its very specific characteristics, fine. However, I think it is time to assess whether it still makes sense to keep V8 as the engine for NodeJS.

Fishrock123 commented 9 years ago

@alejonext it was faster at the time.

rlidwka commented 9 years ago

Yep, v8 was a couple of times faster than everything else at the time. And it played a big part in node.js success. There were other server-side js wrappers, but they were just too slow to matter.

But recent news from mozilla makes me wish we'd go for dual-engine support, where people can choose which engine to use.

jamlfy commented 9 years ago

@rlidwka It would be a model similar to this:

 +-----------------------+
 |   NPM (node_modules)  |
 +-----------------------+
 |      NodeJs Core      |
 +-----------------------+
 |     V8    |  Mozilla  |
 +-----------------------+

othiym23 commented 9 years ago

While it would be possible to write a platform abstraction layer for Node to abstract away the differences between V8 and SpiderMonkey's embedding APIs, my totally seat-of-the-pants guess is that that code would be roughly comparable in complexity to libuv, which is to say a significant and complex project all its own. Supporting one JS engine (regardless of whether it's V8 or SpiderMonkey) is much easier than supporting two.

Fishrock123 commented 9 years ago

... or implement V8 api on top of SpiderMonkey

this has to be the most crazy idea I've heard all week haha, what with v8's pace and all.

idk maybe it'll work

vkurchatkin commented 9 years ago

@rlidwka node is really coupled with v8, so you have to implement all of v8 API, which is pretty much impossible

jamlfy commented 9 years ago

@vkurchatkin I do not think it is impossible, nor do I believe that it is so complex. In fact, the current NodeJS code only uses things written in C for the file system and a few other areas. Everything else uses the NodeJS layer and its libraries.

piscisaureus commented 9 years ago

I am very much in favor of supporting multiple js engines. V8 is great indeed but we don't know (and right now we can't know) if it's actually the best for typical node.js workloads.

Also, avatar.js is already a node implementation on top of a different js engine (nashorn). I don't see why spidermonkey (and javascriptcore, chakra) could not be supported. But I do realize that forklifting node onto another engine is a lot of work, and without leniency from the node project it might never happen.

The question is how to go about it. Creating a v8-api wrapper around spidermonkey and other engines as some have suggested may seem an obvious solution, but you'll run into corner cases pretty fast. That's because you now need to map the concepts and abstractions that the "other" engine provides to v8 concepts exactly. About 3 years ago mozilla tried it (see v8monkey), but that went nowhere. Also, the v8 api is a moving target.

It seems that blessing a project like nan (and using it to implement core bindings) has more potential; nan can choose its own API so it would be possible to define concepts that can be easily mapped to different engines.

I've actually done some investigative work in this area not so long ago, in an attempt to find "common ground" between v8, spidermonkey and chakra. I found that these were the most difficult issues to deal with:

GC-rooting javascript values

The v8 API allows one to root values globally with Persistent<T> and within a function scope using Local<T>. Spidermonkey nowadays has an API that's very similar (see their GC Rooting Guide) . Chakra offers reference counting only (see JsAddRef and JsRelease).

Attaching externally managed memory to a javascript object

Sometimes node needs to associate externally managed memory with a javascript object. Prime examples are Handles, Reqs and Buffers in node-core. Handles and Reqs have a c++ object instance attached to them which contains information about an open libuv handle and a pending asynchronous operation, respectively. Buffers are attached to their backing store that lives outside of the javascript heap.

v8 has the SetAlignedPointerInInternalField API that lets you associate any number of pointers with a v8 object. Same goes for spidermonkey, which has JS_SetReservedSlot and JS_GetReservedSlot. Chakra supports only one pointer per object through JsSetExternalData. This is sufficient for node.

Garbage collector notification

Continuing the previous point - external data requires integration with the garbage collector. For example node frees the backing store for Buffer when the garbage collector notifies that the associated javascript value has been GCed. There are other types of objects that use a weak callback (e.g. FSWatcher); I believe this is poor design, but that's out of scope for this exercise.

v8 supports this by providing a "weak callback" for objects that have been made weak. The v8 API is "overkill" for node because it allows the weak callback to resurrect the object, but node never does this. I haven't figured out yet how to do it with spidermonkey. Chakra has JsSetRuntimeBeforeCollectCallback() which is similar to v8's weak callback except that it doesn't support resurrection.

The vm needs direct access to external buffer memory

As an extension to the previous point - to support an efficient Buffer implementation, the javascript engine must be able to directly use binary data that lives outside of the javascript heap.

V8 supports this through the SetIndexedPropertiesToExternalArrayData api. Spidermonkey does support something similar through CreateTypedArrayWithBuffer, but that (obviously) creates a typed array and not a Buffer object. Chakra doesn't support this as far as I can tell. I recall from speaking with one of the avatar.js developers about half a year ago that this was a problem for them too, because at the time nashorn had no such api.

Note that technically it is not necessary for the Buffer backing store to live outside of the javascript heap; however it must not move, because libuv may be using that buffer at the same time to perform an asynchronous read or write. The v8 memory manager moves objects around, and I expect other engines to do the same (or at least reserve the right to do so).

vkurchatkin commented 9 years ago

In JSC:

piscisaureus commented 9 years ago

@vkurchatkin Thanks. On external array data, is there no API to create an uint8 TypedArray?

vkurchatkin commented 9 years ago

@piscisaureus no, API is es5 only. You can evaluate new Uint8Array(%d), but no access to underlying buffer (as in v8).

Technically, implementing a buffer-like object is not a problem: we need a JSClass with a getter callback (and a bunch of others), which dereferences the private pointer and looks up the data cell. Data can be freed in the finalize callback.

The problem is: JSClass is used to create objects with the C API, and there appears to be no good way to turn a JSClass into a constructor.

piscisaureus commented 9 years ago

@vkurchatkin Thanks! That sounds like an acceptable workaround, although I think that turning every indexed lookup into a virtual function call won't be very performant. I wonder how Safari implements webgl though if the js api doesn't allow buffer access.

vkurchatkin commented 9 years ago

@piscisaureus this API is for embedders, webkit doesn't use any of it, as far as I know

domenic commented 9 years ago

I would think that, if people were given the feature of supporting multiple JS engines, they might be willing to pay the cost of moving from non-standard Buffers to standard typed arrays? The reason we still have Buffers is just back-compat, right? I don't remember them being technically superior, apart from being a great solution before JS typed arrays were properly standardized and implemented.

vkurchatkin commented 9 years ago

@domenic one more advantage: Buffers use uninitialized memory. It would be cool if spec made initialization optional (or a part of Annex B)

domenic commented 9 years ago

Oh, right, I forgot about that :(. Pretty hard to imagine that kind of security hole being put in the spec. Maybe some optimizations could be in place though so that it gets zeroed out just in time for use or something? Bleh, I dunno.

vkurchatkin commented 9 years ago

@domenic why is this a security hole? I mean, for browsers, it is, but ECMAScript is more than just a language for web scripts. This requirement can be put in Annex B or https://javascript.spec.whatwg.org/

domenic commented 9 years ago

Anything that allows you to read arbitrary memory is a security hole in any environment.

vkurchatkin commented 9 years ago

@domenic I don't get it. So Buffer is a security hole? or malloc?

sergioramos commented 9 years ago

About 3 years ago mozilla tried it (see v8monkey), but that went nowhere.

https://twitter.com/zpao/status/468860127391809536

mikeal commented 9 years ago

v8 has made breaking changes in their API enough times now that we MUST assume it will break again, and again, and probably again. simulating an API that is by any definition unstable seems like a bad idea.

bnoordhuis commented 9 years ago

I don't get it. So Buffer is a security hole?

Not exactly "is" but "can be". Imagine you store confidential data like a password in memory. When the memory is (re)used by a buffer and you have a bug in your application that transmits the buffer over the network without clearing it first, you leak confidential data.
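To make that scenario concrete, here is a minimal sketch. It assumes the `Buffer.alloc` and `Buffer.allocUnsafe` methods that later Node.js releases added (they did not exist when this was written; `new Buffer(size)` behaved like the unsafe variant). The `buildReply` helper is a hypothetical name used only for illustration.

```javascript
// Buffer.alloc zero-fills; Buffer.allocUnsafe may hand back recycled memory.
const safe = Buffer.alloc(16);          // always sixteen 0x00 bytes
const unsafe = Buffer.allocUnsafe(16);  // contents are unspecified

// Hypothetical helper: writes the payload at offset 0 and reports how many
// bytes were actually written.
function buildReply(target, payload) {
  const written = target.write(payload, 0);
  return { reply: target, written };
}

const { reply, written } = buildReply(unsafe, 'ok');

// Sending only the written prefix is fine. Sending all 16 bytes of `reply`
// would also transmit whatever happened to be in the other 14 positions.
const toSend = reply.subarray(0, written);
console.log(safe.every((byte) => byte === 0), toSend.toString('utf8'));
```

The bug described above corresponds to transmitting `reply` instead of `toSend`.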

v8 supports this by providing a "weak callback" for objects that have been made weak. I haven't figured out yet how to do it with spidermonkey.

With SM, it's the JSClass.finalize callback of the object's backing JSClass.

v8 has the SetAlignedPointerInInternalField API that lets you associate any number of pointers with a v8 object. Same goes for spidermonkey, which has JS_SetReservedSlot and JS_GetReservedSlot.

FWIW, I could never get this to quite match the V8 semantics, even with the undocumented JSClass magic for making an object indexable (which was super slow, by the way.)

EDIT: IIRC, the issue was that SlowBuffer.prototype.__proto__ === Buffer.prototype in v0.10, something I couldn't express with SM's class system. Perhaps not an issue with v0.12.

vkurchatkin commented 9 years ago

@bnoordhuis if you have a bug, everything can be a security hole: filesystem operations, child processes, etc. Anyway, I understand HOW this can be an issue, I just don't think it should be a part of ECMAScript spec, because it is not limited to sandboxed environments.

apaprocki commented 9 years ago

Hi, I figured I'd chime in about what we do at Bloomberg...

Re: multi-engine binding -- that is essentially what we have been doing for many years. We have an engine independent set of binding interfaces and as of right now, we have concrete implementations for V8, SM, and even Lua (to just keep us honest that it really does work abstractly for dynamic Object-like things). Our internal JS process is not a Node replacement -- we had very different goals from what Node itself provided. This binding layer is just one of the many building blocks one could use to build a Node equivalent. We write nice native modules once and they can be used in any engine that has a concrete adapter. We currently run our JS process atop either V8 or SM depending on what hardware architecture it is running on.

I want to try to get at least the binding layer open-sourced up on our GH account, but it depends on a few other types/components in our stack that are not yet available in http://github.com/bloomberg/bde. I'll continue working on getting the prerequisites released, but I can't be sure when it will happen.

mvalente commented 9 years ago

Just my 2cents https://github.com/olegp/common-node

trevnorris commented 9 years ago

@piscisaureus

Note that technically it is not necessary for the Buffer backing store to live outside of the javascript heap;

Disagree. It'd be really dumb to clog up the heap with large objects. It might be a corner case, but still very useful when I can load large data sets into memory and work on them from memory.

@domenic

The reason we still have Buffers is just back-compat, right?

Impossible to switch to typed arrays because we can't extend natives, meaning we can't allow both quick indexed array access and additional methods on the same object.
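For what it's worth, the ES2015 `class` syntax (standardized after this comment was written) does allow subclassing typed arrays, which gives indexed access and extra methods on the same object. A minimal sketch, assuming an ES2015-capable engine; `ByteBuf` and its method are hypothetical names, not part of any Node API:

```javascript
// Hypothetical subclass: a Uint8Array with one Buffer-style read method.
class ByteBuf extends Uint8Array {
  // Read an unsigned 16-bit little-endian integer at the given offset.
  readUInt16LE(offset) {
    return this[offset] | (this[offset + 1] << 8);
  }
}

const bytes = new ByteBuf(4);
bytes[0] = 0x34;  // plain indexed write, as with any Uint8Array
bytes[1] = 0x12;

console.log(bytes.readUInt16LE(0).toString(16));  // 1234
console.log(bytes instanceof Uint8Array);         // true
```

Whether indexed access on such a subclass is as fast as on a plain typed array depends on the engine, so this only addresses the "can't extend natives" part of the objection.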

@bnoordhuis

IIRC, the issue was that SlowBuffer.prototype.__proto__ === Buffer.prototype in v0.10, something I couldn't express with SM's class system. Perhaps not an issue with v0.12.

That's no longer an issue in v0.12.

piscisaureus commented 9 years ago

@trevnorris

Disagree. It'd be really dumb to clog up the heap with large objects. It might be a corner case, but still very useful when I can load large data sets into memory and work on them from memory.

That's definitely true for v8 because it has a fully managed heap with a size limit. However other engines may just malloc() space for javascript objects, so it doesn't really matter whether the vm or node manages that allocation - the net result is the same. But I agree we shouldn't haphazardly assume that buffers can live on a managed heap.

domenic commented 9 years ago

Impossible to switch to typed arrays because we can't extend natives, meaning we can't allow both quick indexed array access and additional methods on the same object.

When I say "switch," I mean in a non-API-compatible way. Given that, what additional methods do you need (that you can't get from e.g. a DataView)?

piscisaureus commented 9 years ago

@domenic

> Buffer
{ [Function: Buffer]
  isEncoding: [Function],
  poolSize: 8192,
  isBuffer: [Function: isBuffer],
  byteLength: [Function],
  concat: [Function] }
> Buffer.prototype
{ inspect: [Function: inspect],
  get: [Function: get],
  set: [Function: set],
  write: [Function],
  toJSON: [Function],
  toString: [Function],
  fill: [Function: fill],
  copy: [Function],
  slice: [Function],
  utf8Slice: [Function],
  binarySlice: [Function],
  asciiSlice: [Function],
  utf8Write: [Function],
  binaryWrite: [Function],
  asciiWrite: [Function],
  readUInt8: [Function],
  readUInt16LE: [Function],
  readUInt16BE: [Function],
  readUInt32LE: [Function],
  readUInt32BE: [Function],
  readInt8: [Function],
  readInt16LE: [Function],
  readInt16BE: [Function],
  readInt32LE: [Function],
  readInt32BE: [Function],
  readFloatLE: [Function],
  readFloatBE: [Function],
  readDoubleLE: [Function],
  readDoubleBE: [Function],
  writeUInt8: [Function],
  writeUInt16LE: [Function],
  writeUInt16BE: [Function],
  writeUInt32LE: [Function],
  writeUInt32BE: [Function],
  writeInt8: [Function],
  writeInt16LE: [Function],
  writeInt16BE: [Function],
  writeInt32LE: [Function],
  writeInt32BE: [Function],
  writeFloatLE: [Function],
  writeFloatBE: [Function],
  writeDoubleLE: [Function],
  writeDoubleBE: [Function] }

domenic commented 9 years ago

@piscisaureus yes, ArrayBuffer combined with DataView gives you all of those... except maybe not the little/big endian distinction, and you have to use text encoding APIs for the text stuff.
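On the endianness point: the `DataView` getters and setters take an optional `littleEndian` flag, so both byte orders are reachable. A small sketch (the variable names are only illustrative):

```javascript
const view = new DataView(new ArrayBuffer(4));
view.setUint16(0, 0x1234, true);   // little-endian: bytes 0x34, 0x12
view.setUint16(2, 0x1234, false);  // big-endian:    bytes 0x12, 0x34

const raw = Array.from(new Uint8Array(view.buffer));
console.log(raw);                       // [ 52, 18, 18, 52 ]
console.log(view.getUint16(0, true));   // 4660  (0x1234)
console.log(view.getUint16(0, false));  // 13330 (0x3412)
```

Omitting the flag means big-endian, which is the opposite of what most hardware Node runs on uses natively.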

trevnorris commented 9 years ago

@domenic fill()? Also, coming in v0.12 is a generic (Read|Write)(U)Int(LE|BE)() that can handle 8, 16, 24, 32, 40 and 48 bit reads/writes.
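A short sketch of those generic readers, using the `readUIntLE(offset, byteLength)` and `readUIntBE(offset, byteLength)` signatures from the Node.js Buffer documentation. `Buffer.from` is an assumption here: it was added in later releases, and at the time of this thread the equivalent was `new Buffer([...])`.

```javascript
const buf = Buffer.from([0x01, 0x02, 0x03, 0x04, 0x05, 0x06]);

// byteLength may be anything from 1 to 6 (8 to 48 bits).
console.log(buf.readUIntLE(0, 3).toString(16));  // 30201
console.log(buf.readUIntBE(0, 3).toString(16));  // 10203
console.log(buf.readUIntBE(0, 6).toString(16));  // 10203040506
```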

TextDecoder has several pitfalls. First is that it doesn't support ISO-8859-1 (AFAIK). Another is the API. Let's take a quick look at creating a buffer of arbitrary length, filling it with character values and finally reading an arbitrary section of it (doing this to the best of my knowledge of existing specs):

First Node:

// size, start, end ∈ ℕ; start <= end <= size
// fill is a number or string
function genString(size, fill, start, end) {
  var b = new Buffer(size).fill(fill);
  // toString() takes the encoding first, then the byte range
  return b.toString('utf8', start, end);
}

Now using browser APIs:

// size, start, end ∈ ℕ; start <= end <= size
// fill is a number or string
function genString(size, fill, start, end) {
  var dv = new DataView(new ArrayBuffer(size));
  if (typeof fill === 'number')
    fillDVNumber(dv, fill);
  else
    fillDVString(dv, fill);
  var td = new TextDecoder('utf-8');
  // DataView takes (buffer, byteOffset, byteLength)
  return td.decode(new DataView(dv.buffer, start, end - start));
}

function fillDVNumber(dv, num) {
  for (var i = 0; i < dv.byteLength; i++)
    dv.setUint8(i, num & 0xff);
}

function fillDVString(dv, str) {
  // not going to take the time implementing a utf-8 safe
  // string fill.
}

There are other small pains, like needing to externalize an ArrayBuffer before external APIs can directly access its memory, and not being able to allocate uninitialized memory.

Additionally, the Buffer implementations of the numeric read/write methods are much faster than those for DataView() (~8ns vs ~47ns).

Granted there are some parts of the Typed Array API I really like, but it just isn't well suited to working with data the way Node does.
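For completeness, one possible way to fill in the string case left unimplemented above. This is a sketch that assumes the `TextEncoder` and `TextDecoder` globals are available; it is not necessarily what the author had in mind, and it is not UTF-8 safe at the end of the view.

```javascript
// Hypothetical completion of fillDVString: repeat the UTF-8 bytes of `str`
// across the whole view. A multi-byte character can be cut off at the end.
function fillDVString(dv, str) {
  const bytes = new TextEncoder().encode(str);
  if (bytes.length === 0) return;  // nothing to repeat
  for (let i = 0; i < dv.byteLength; i++)
    dv.setUint8(i, bytes[i % bytes.length]);
}

const dv = new DataView(new ArrayBuffer(8));
fillDVString(dv, 'abcd');
console.log(new TextDecoder('utf-8').decode(dv));  // abcdabcd
```

Even in this short form it takes an encoder, a loop and a decoder to match what `fill()` plus `toString()` do on a Buffer.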

domenic commented 9 years ago

@trevnorris re: fill, http://people.mozilla.org/~jorendorff/es6-draft.html#sec-%typedarray%.prototype.fill

But, in general, thanks for the detailed comparison. I still would hope that if we were starting from a world in which typed arrays already existed, we wouldn't invent Buffers. The uninitialized-memory thing seems like the only potential actual dealbreaker.

trevnorris commented 9 years ago

@domenic Thanks for the link to the specification.

One difference between that fill() implementation and what's coming in v0.12 is that:

var b = Buffer(8).fill('abcd');
console.log(b.toString());
// output: 'abcdabcd'

Minor, but still useful.
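The difference can be seen side by side in a small sketch. `Buffer.alloc` is an assumption here (it was added in later Node.js releases; at the time this was written the call was `Buffer(8)`).

```javascript
// Buffer#fill repeats the bytes of a string argument.
const b = Buffer.alloc(8).fill('abcd');
console.log(b.toString());  // abcdabcd

// TypedArray#fill converts its argument to a number first. 'abcd' becomes
// NaN, which a Uint8Array stores as 0.
const t = new Uint8Array(8).fill('abcd');
console.log(t.join(','));   // 0,0,0,0,0,0,0,0
```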

Also, was I mistaken that TextDecoder doesn't support ISO-8859-1? I hope it does.

domenic commented 9 years ago

Minor, but still useful.

Yeah, in general it seems like Buffer has more ergonomic text-manipulation stuff.

Also, was I mistaken that TextDecoder doesn't support ISO-8859-1? I hope it does.

You can decode iso-8859-1 (aka windows-1252, and lots more), but you can only encode as UTF-8 or UTF-16.

trevnorris commented 9 years ago

You can decode iso-8859-1 (aka windows-1252)

If that were the case then I wouldn't expect the following:

var b = new Buffer(1);
b[0] = 159;
// 'binary' is V8's one-byte encoding (ISO-8859-1)
b.toString('binary');
// output: '\u009f'
// expected: 'Ÿ'

That might just be a V8-ism, and it's possible it changes. Just FYI.
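The mismatch is easier to see by looking at code points rather than at rendered characters. A small sketch, assuming `Buffer.from` from later Node.js releases:

```javascript
const b = Buffer.from([159]);

// Node's 'binary' encoding maps each byte straight to the code point with
// the same value, so byte 0x9F becomes U+009F (a C1 control character).
const s = b.toString('binary');
console.log(s.charCodeAt(0).toString(16));        // 9f

// windows-1252 assigns byte 0x9F to U+0178 instead.
console.log('\u0178'.charCodeAt(0).toString(16)); // 178
```

So the two labels agree on most bytes but differ in the 0x80 to 0x9F range.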

but you can only encode as UTF-8 or UTF-16.

Ah, that's painful. Know if there's any discussion about extending that in the future?

domenic commented 9 years ago

That might just be a V8-ism, and it's possible it changes.

Yeah, not sure what relation V8's encoding has to windows-1252/iso-8859-1/etc.

Know if there's any discussion about extending that in the future?

Not for the web; the proliferation of non-UTF-8 text is harmful (I'm hoping we can remove UTF-16 actually...) and shouldn't be convenient. You can make the argument that it's more necessary for a server-side runtime I guess, but then again you can make the same argument that non-UTF-8 encodings are harmful and should be inconvenient (i.e. require user-created encoders).

piscisaureus commented 9 years ago

TextDecoder has several pitfalls. First is that it doesn't support ISO-8859-1

Well node doesn't either. The "binary" encoding is effectively latin-1.

piscisaureus commented 9 years ago

@domenic @trevnorris I think it'd be better to move TypedArray/Buffer talk to a separate issue

jamlfy commented 9 years ago

So, in conclusion, the problems in building an alternative are the Buffer and the character encoding?

5nyper commented 9 years ago

An alternative to node could be Dart; benchmarks prove the DartVM is faster than V8.

rlidwka commented 9 years ago

An alternative to node could be Dart; benchmarks prove the DartVM is faster than V8.

I surely hope it's a joke. If not, I'll happily suggest to switch to Erlang instead. :)

feross commented 9 years ago

Yo guys, let me tell you about Haskell... It's just like node (except not really...)

5nyper commented 9 years ago

No, it isn't a joke. What I HOPE is that your reply is a joke.

Fishrock123 commented 9 years ago

looks like we are switching to erlang guys

5nyper commented 9 years ago

Take a look at this then: http://www.infoq.com/news/2013/05/Dart-Java-DeltaBlue

Fishrock123 commented 9 years ago

Ok, node is named node.[javascript] for a reason. Quite frankly there are many other languages already out there for I/O (go, rust, java, etc).

I think anything that isn't about node as a javascript I/O is definitely off-topic. (Not that this thread hasn't been for a long time.)

kav commented 9 years ago

It seems that the primary confusion here is this: Node was designed as a platform for writing async server applications that happens to use javascript.

Many folks seem to think node is for writing server applications in javascript and happens to be async.

Thus they confuse the urge to remove problematic Sync constructs as zealotry when it's a totally rational move for a non-blocking server environment. Conversely they can't figure out why everyone gives them looks when they want to add a solid synchronous model. Which does make perfect sense if you just want one language on your client and server.

Node is non-blocking and async first and foremost and uses javascript secondarily. Adding a bunch of sync stuff thus doesn't really make sense from a fundamental standpoint.

This does seem to come up an awful lot. More and more as big companies join the party. Does anyone have a great resource to point to that explains the what and whys of node? Might help explain the crazy looks and perception of zealotry.

TL;DR It'd be easier to drive planes if they didn't have wings and go up in the air. But then we'd call them cars, not planes.

domenic commented 9 years ago

Node is non-blocking and async first and foremost and uses javascript secondarily.

Asserting this does not make it true.

mvalente commented 9 years ago

And then there are people who have been on the JS serverside camp for more than 6 years and who find it funny to see displays of crazy looks and zealotry which are actually proper for the newly converted.

http://mvalente.eu/?s=Tracemonkey&x=13&y=12
