Closed creationix closed 8 years ago
I think a first step would be to define an interface which all backend implementations should adhere to. One of the goals here is to avoid C/C++ addons so we don't need to worry about a public facing C or C++ API for addons initially.
Since it's a royal pain to try and get all JS engines to conform to a least-common-denominator interface, let's instead have independent implementations for each engine that all match some JS interface spec. Then modules written for one runtime will work for all so long as they don't use language features unique to a particular runtime (V8 and Chakra for example have most/all of ES6 while duktape is mostly ES5, but has lua style coroutines).
Once we have a common interface, we can start implementing the C parts for the various runtimes. I humbly suggest the I/O parts be directly designed to match libuv.
I think this will also be greatly simplified if we use things such as @mscdex's work to make the dns resolver be pure-js and not use c-ares https://github.com/nodejs/node/pull/1843, and the http-parser: https://github.com/nodejs/node/pull/1457
> I think a first step would be to define an interface which all backend implementations should adhere to.
Sounds like @trevnorris's original API WG goals.
:-) I've had a long-standing goal to create an API for node that serves as a strict entry point into C++, which basically all code in `lib/` would use. Alas, time constraints.
As I mentioned in the linked DNS PR, it is difficult to get even close to matching the performance of c-ares/libc, even when using node's C++ UDP bindings directly. That pretty much rules out any performance issues in js land, so the C++ layer would have to be improved (if possible) to be able to compete with c-ares and/or the system resolver.
Regarding http, I haven't compared benchmarks since @indutny incorporated the JS stream stuff that bypasses js land when doing http parsing, so I'm not sure how the pure js http parser fares anymore.
As far as DNS resolving, in luvit we have two paths. One is pure lua on top of libuv's UDP primitives and is used for advanced queries. For basic resolving of domain names to IP addresses, we use libuv's `getnameinfo` and `getaddrinfo`, which use the system library on a thread pool, I believe. We have had no performance issues with this. Both are pure script on top of what libuv provides natively, which will be provided in the C core.
Let's not get too tripped up on edge performance issues. The goal here isn't to win synthetic benchmarks with the vanilla flavor of the minimal core. We will have options where people can build different flavors of the core with various libraries included (like openssl, cares, http_parser, etc). If you're deploying a large enough system where these performance issues are actually a problem, then you don't mind compiling a little C code. But for most projects and development workflows, this is not critical.
In luvi, there are two main flavors known as "tiny" and "regular", with the biggest difference being that regular includes openssl and a couple of lesser-used C addons in its core. In many cases, http servers don't need openssl since they are running behind a reverse proxy that handles the TLS termination anyway. Things like MD5, SHA1, etc. can usually be handled just fine (and sometimes even faster) in pure script.
Wow, I really like this proposal. I was working on something similar recently:
It is a modular C stream implementation. Not sure how useful it is, but it could be a good enough interface for interactions between C addons.
@indutny I saw those. I've always said libuv should have an extension community where things are written in C and can be used by all runtimes that consume libuv. Those could be included as well in the core if they are tiny (which I expect) and in optional addons if not.
If they are only to be consumed by other C code and it doesn't make sense to expose them to JS, that's fine. It will still be useful for addons to core that can use them.
I started a new issue for designing the libuv -> js mapping interface that all implementations must adhere to. https://github.com/creationix/nucleus/issues/2
`uv_link_t` by itself is very small. `uv_ssl_t` is a bit bigger.
So I think we could say a nucleus implementation contains:
I think for part 4, we should follow the pattern in luvi. This means including minimal code for reading zip files. I have a modified version of miniz, with bug fixes and missing features added, that works great for this and is super tiny.
This will expose a bundle API that allows scripts to read from the virtual filesystem, which can be either a zip file (standalone or appended to nucleus) or a folder on disk.
We will also have some minimal hooks that make bootstrapping a require system in userland less painful. For example, it can look for a file `bundle:deps/require.js` or something and auto-run it if it exists before running `bundle:main.js`.
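A minimal sketch of that bootstrap order, assuming a bundle object with a `readfile` method and a `run` helper that compiles and executes a source string. Both names, and the in-memory bundle, are made up for illustration; the real bundle would be backed by a zip file or a folder.

```javascript
// Hypothetical in-memory stand-in for the bundle (a real one reads a zip or a folder).
function makeBundle(files) {
  return {
    // Returns the file contents as a string, or null if the path is absent.
    readfile(path) {
      return Object.prototype.hasOwnProperty.call(files, path) ? files[path] : null;
    },
  };
}

// Run the optional bootstrap script first, then the main script.
// Returns the list of bundle paths that were executed, in order.
function boot(bundle, run) {
  const ran = [];
  const bootstrap = bundle.readfile('deps/require.js');
  if (bootstrap !== null) {
    run(bootstrap, 'bundle:deps/require.js');
    ran.push('deps/require.js');
  }
  const main = bundle.readfile('main.js');
  if (main === null) throw new Error('bundle has no main.js');
  run(main, 'bundle:main.js');
  ran.push('main.js');
  return ran;
}

// Demo: the bootstrap script installs a loader that main.js then uses.
const log = [];
const scope = { log };
const run = (source, filename) =>
  new Function('scope', source + '\n//# sourceURL=' + filename)(scope);
const order = boot(
  makeBundle({
    'deps/require.js': 'scope.require = (name) => "loaded " + name;',
    'main.js': 'scope.log.push(scope.require("app"));',
  }),
  run
);
```

If the bundle has no `deps/require.js`, `boot` skips straight to `main.js`, so a bundle that statically links everything still works.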
I do not wish to disappoint you, but there is already something similar in the works: https://github.com/saghul/sjs
I have to say I am interested in this kind of project, since I think it could let people script their software with JavaScript (which more people are familiar with) instead of Lua. There is also a large library of nodejs-compatible JavaScript modules; making it possible to use them from the application itself would be a great plus.
My suggestion would be to go for C++11/14 support, and not just plain C. Exposing an API and enabling users to expose their classes to JavaScript is very useful. The LuaBridge project for Lua enables you to expose your classes and objects to a Lua engine. A similar out-of-the-box solution would make integration with user code even easier.
Note that Lua and duktape are quite similar in design so similar patterns can be used.
If you go this or similar way "you have my axe" :)
Am also super into this.
Plain C makes for easier interop with whatever other language one might be interested in calling from (e.g. Rust).
@mmicko I'm not disappointed, I know about sjs and even linked to it in the parent conversation in the nodejs issue. From my initial browsing however, sjs is much higher-level and opinionated than this project is aiming to accomplish.
Glue to make applications.
@creationix We'll probably need a good amount of `process`, and some sort of `module`... bootstrapping at least. (Or maybe we just use ES modules?)
Is that what you meant by "glue"?
@creationix good to hear that
@dlmanning I understand that a C API is easiest to combine with other languages; I'm just pointing out that C++11/14 support would be quite welcome
@mmicko Also since we're fixing the interop level at the JS interface exposed by the C/C++ backend we don't need to standardize on a language/version. The duktape backend might be all C89 while the V8 backend will obviously have some C++ involved. The common glue layer can even have multiple implementations if needed as long as the JS interface matches the spec. This is why it's important to define the interface clearly.
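One way to picture "the JS interface matches the spec" is a conformance check that any backend can be run against. The sketch below is only an illustration: the spec entries (`readfile`, `compile`, `versions`) are invented names, not the actual nucleus interface.

```javascript
// Hypothetical slice of an interface spec: property name -> expected typeof.
const spec = { readfile: 'function', compile: 'function', versions: 'object' };

// Return a list of spec entries that a backend is missing or has the wrong type for.
function conformanceErrors(backend, spec) {
  const errors = [];
  for (const [name, type] of Object.entries(spec)) {
    if (typeof backend[name] !== type) {
      errors.push(name + ': expected ' + type + ', got ' + typeof backend[name]);
    }
  }
  return errors;
}

// Two fake backends: one complete, one missing most of the interface.
const complete = { readfile() {}, compile() {}, versions: { engine: 'x' } };
const incomplete = { readfile() {} };
const ok = conformanceErrors(complete, spec);
const bad = conformanceErrors(incomplete, spec);
```

Because the check only looks at the JS-visible surface, it does not care whether the backend underneath is C89 or C++.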
Note: using just ES modules is quite incompatible with the current node ecosystem, so we'd still have to have some module bootstrapping available for the `module` module, I think.
(& It would probably still have to be passed to scripts implicitly, like `require`. ...So it would probably have to be a part of the nucleus, I think.)
I don't want the module system to be part of the core glue. All we need is some conventions for bootstrapping a module system of choice. I really don't want things like node's global `process` in this layer.
For the curious, you can see how luvit accomplished this. Both `process` and `require` are userspace modules.
@Fishrock123 I envision two parts.
The core API will provide things like loading files by path, scanning directories, getting cwd, getting environment variables, getting path to main binary.
It would also expose the JS runtime with API functions for compiling strings into code (with filename and ES goal type)
Would this not be enough? What APIs exactly would need to be provided for a module system to be implemented?
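To make the question concrete, here is a rough sketch of a CommonJS-style loader built from just two primitives: read a file by path, and compile a string into code with a filename. The `core` object, its method names, and the fake file table are assumptions for this example. Only relative requests are handled.

```javascript
// Hypothetical core primitives, faked here so the sketch runs anywhere.
const files = {
  '/app/main.js': 'exports.answer = require("./lib/add.js")(40, 2);',
  '/app/lib/add.js': 'module.exports = (a, b) => a + b;',
};
const core = {
  readFile(path) {
    if (!Object.prototype.hasOwnProperty.call(files, path)) {
      throw new Error('no such file: ' + path);
    }
    return files[path];
  },
  // Compile a source string into a function taking the CommonJS-style locals.
  compile(source, filename) {
    return new Function('exports', 'require', 'module',
      source + '\n//# sourceURL=' + filename);
  },
};

// Resolve a relative request against the directory of the requiring file.
function resolve(fromFile, request) {
  const parts = fromFile.split('/').slice(0, -1);
  for (const seg of request.split('/')) {
    if (seg === '..') parts.pop();
    else if (seg !== '.' && seg !== '') parts.push(seg);
  }
  return parts.join('/');
}

const cache = {};
function load(path) {
  if (cache[path]) return cache[path].exports;
  const mod = { exports: {} };
  cache[path] = mod; // cache before running so cyclic requires terminate
  const fn = core.compile(core.readFile(path), path);
  fn(mod.exports, (request) => load(resolve(path, request)), mod);
  return mod.exports;
}

const main = load('/app/main.js');
```

Everything above the two `core` functions is plain userland script, which is the point of keeping the module system out of the core.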
For luvit's require which is modeled after node's I basically needed:
@Fishrock123 I think the simplest way to expose the builtin C modules without depending on a module system is to have some global object (like `global.NUCLEUS`) that exposes the various builtin modules. Userspace module systems could then expose a uniform interface where `require('uv')` simply returns `global.NUCLEUS.uv`, but `require('some-other')` is handled by the custom loader.
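A small sketch of that arrangement. The contents of the `NUCLEUS` object and the custom loader are placeholders; `globalThis` is used in place of `global` only so the snippet runs in any modern engine.

```javascript
// Stand-in for the object the native core would install; its contents are made up.
globalThis.NUCLEUS = {
  uv: { name: 'fake-uv-binding' },
};

// Build a require() that serves builtins from NUCLEUS and defers everything else.
function makeRequire(customLoader) {
  return function require(name) {
    if (Object.prototype.hasOwnProperty.call(globalThis.NUCLEUS, name)) {
      return globalThis.NUCLEUS[name];
    }
    return customLoader(name);
  };
}

// Demo with a trivial custom loader that just records what it was asked for.
const userRequire = makeRequire((name) => ({ loadedBy: 'custom', name }));
const uv = userRequire('uv');
const other = userRequire('some-other');
```

The core never needs to know that `makeRequire` exists; it only has to provide the global.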
You could even call it `process.binding` :trollface:
@domenic As I told @Fishrock123 in IRC, I'd like to avoid any name clashes with anything existing in node so I don't have to worry about matching semantics. This layer needs to have as little opinion as possible.
Also, `process.binding` will go away if this ever lands in core. And it will assuredly have a different shape.
@Fishrock123 I wrote up the beginnings of a README with the parts that are currently designed. This should help solidify the design goals a little.
@dlmanning see #3
@creationix : I am not as funny as I think I am...
@creationix It would be nice if `nucleus` would be available as a `library` for C++ embedding. I have used `jxcore` for this purpose: https://github.com/jxcore/jxcore/blob/master/doc/native/Embedding_Basics.md and quite liked it. But it is not supported anymore ;(
@drom I'm not sure there would be much in here apart from what's provided in the JS engines and the bindings. I'll try to make the various bindings independent enough that they could be used embedded in other projects.
Hi! I'm poking at something along the same lines over here. It builds and runs on linux (ubuntu trusty) and OSX thus far, and glues v8 to libuv & uv_link_t using `gn`.
It currently leans on a hacked-up version of chromium's `build/` dir, which I'm tearing apart to get to the salient bits. The idea is to get it running on windows, osx, and linux first, then rewrite the `build` dir's gn stuff in a cleaner way to get to that end.
The experiment is thus:
- … `require`) to js.
- … `gclient` & `gn` to pull in the minimal binding layer.
- … `process`) is accessed, or a node builtin module is required (`require('fs')`), short circuit the lookup to `require('@nojs/node-<target>')`.
- … `npm install` working and bundle `npm` with the project.
My (handwave-y) plans are as follows, and you'll each probably find something you like and something you dislike here:
- … `Promise`. `async` is coming.
- … `ReadableStream` spec later.
- … `gn` and `gclient`, keep deps up to date with `gclient sync`.
- … `mmap` in order to create callable executable code from JS (possibly only for core functionality, but maybe not.)
In other words: I think this project and nojs are probably going to be walking along the same path for a bit, though it seems like eventually we'll have different goals. I'm happy to share the build code I've hacked together. Maybe making it easier to grab a compilable, working copy of libuv+v8 & friends will let a thousand Nodes bloom.
@chrisdickinson looks very cool! Though, you probably would like to use `jit.js` instead of `heap.js`, since the latter one is a JS VM Heap implementation...
@chrisdickinson thanks for the feedback. Indeed our goals are slightly different. Also I'll be starting with duktape and jerryscript as sample implementations of this interface, as I abhor C++ and that steers me away from V8. Once I have things stable it would be awesome to use your code to make a V8 implementation.
Also the scope of this project seems to be a bit slimmer. I won't have any opinions at all regarding streams, promises, etc. I just want to provide a common base for tools to be built.
@indutny Ah indeed! I was thinking about repurposing this code to do the hop from JS to compiled code.
@creationix Cool, I wish you the best of luck! I'd definitely encourage checking out `gn` as a metabuild tool; it's slightly opaque but is pretty slick after a bit of use. I'm collecting a list of possibly handy links on the process of gluing stuff together.
@chrisdickinson https://github.com/js-js/jit.js/blob/master/src/jit.cc#L56-L96 ;)
I am certainly of the opinion that @creationix's opinion-lite approach is the way to go. Streams should definitely not be in the "core"; there are way too many opinions in streams. We even have @creationix's min-streams and my pull-streams because we couldn't agree on one thing, and they are incredibly simple!
I think a project like this is really a C project, it looks like it's about javascript but it's not. It's about finding a way for C libraries to easily plug into a thing, it seems to involve javascript, but would that even be necessary?
There are totally legitimate reasons not to include certain C libraries (personally, I'd like to be able to exclude openssl and build in libsodium instead; this would be ideal for secure decentralization projects). Clearly there are also different JS engines that target different use cases (jerryscript for low resource use vs. v8 for performance).
I think that means that the particular C libraries used need to be loosely coupled; I just need to pull them in by editing a config (or package.json).
@drom's point about embedding as a library would be super valuable too. That would make this easy to deploy as an android app: just write a Java binding to it and then embed directly into the same process.
but @chrisdickinson I think you are right about FFI. It's too hard to write a node binding, if you could just call a C function from "javascript" then we are done. Is that what you are thinking here?
even if I have to put the args I am calling into a buffer, that is still easier than the current way to write node bindings.
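As a rough illustration of "putting the args into a buffer", the sketch below packs typed arguments into a single `ArrayBuffer` with a `DataView`. The type tags, the tightly packed little-endian layout, and the `packArgs` name are all assumptions for this example; no actual FFI layer is implied.

```javascript
// Byte sizes for the two argument types this sketch supports.
const SIZES = new Map([['i32', 4], ['f64', 8]]);

// Pack a list of { type, value } arguments into one ArrayBuffer, in order,
// little-endian, with no padding (an assumed layout, not a real ABI).
function packArgs(args) {
  let total = 0;
  for (const { type } of args) {
    if (!SIZES.has(type)) throw new Error('unsupported type: ' + type);
    total += SIZES.get(type);
  }
  const view = new DataView(new ArrayBuffer(total));
  let offset = 0;
  for (const { type, value } of args) {
    if (type === 'i32') view.setInt32(offset, value, true);
    else view.setFloat64(offset, value, true);
    offset += SIZES.get(type);
  }
  return view.buffer;
}

const packed = packArgs([
  { type: 'i32', value: 7 },
  { type: 'f64', value: 1.5 },
]);
```

A native side would then read the same layout back out of the buffer before making the call.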
I should also point out that you don't actually need a module system. If you can run one javascript file, then you can statically link the javascript. i.e. with browserify, or noderify (which is assembled from browserify parts to make node.js scripts start really fast)
Initial core API is documented in the README and I just prototyped a duktape version (minus libuv and zip reading) that you can see in action.
- … `dofile` directly to manually load its minimal libraries.
See it in action: https://asciinema.org/a/b0yk23l05yhrw9mlp0uqik6pp
@dominictarr while it's true you don't need a module system, I do love a workflow that doesn't have build steps. As I demonstrated in the asciicast, you can run apps directly out of the source tree while developing without needing to rebuild the final binary. If the JS needs to go through a build step it breaks this simple workflow.
Given that JS now has a module system in its specification, it would seem strange to not build it in, no?
@dlmanning sure, if you are using a javascript engine that implements modules, then you could have that. The engines that @creationix is talking about starting with, jerry-script and duktape, both implement ES5.1.
@dominictarr sorry, I missed the bit about starting with JerryScript
@dominictarr:
> but @chrisdickinson I think you are right about FFI. It's too hard to write a node binding, if you could just call a C function from "javascript" then we are done. Is that what you are thinking here?
Yep!
@dlmanning: Notably, the module system is only ~sorta implemented in stable V8's as well (flagged and, IIRC, incomplete.)
@chrisdickinson : sure, it's a work in progress, but it's in progress.
(Don't worry, I have no desire to turn this thread into another ES Modules debate)
On the side about `import`: it's not possible to resolve a path at runtime, which makes development of native modules a little more painful when you simply want to run:
$ NODE_DEBUG=1 ./node_g /path/to/my/module
and have it automatically pick up the Debug build of the binary. Setting up the application in this way, I'd assume there would be more than a few native modules written to extend the basic functionality.
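A sketch of the runtime choice being described, which a `require`-style loader can make but a static `import` declaration (string literal specifiers only) cannot. The `addonPath` helper and the `.node` file layout are illustrative; the `NODE_DEBUG=1` check simply mirrors the command above.

```javascript
// Choose which build of a native module to load, based on the environment.
// `build/Debug` and `build/Release` follow the usual node-gyp output layout.
function addonPath(env, name) {
  const flavor = env.NODE_DEBUG === '1' ? 'Debug' : 'Release';
  return 'build/' + flavor + '/' + name + '.node';
}

const debugPath = addonPath({ NODE_DEBUG: '1' }, 'mymodule');
const releasePath = addonPath({}, 'mymodule');
```

The resulting path would then be handed to whatever dynamic loading mechanism the runtime provides.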
@trevnorris : Seems like it would be good to provide a separate means of deliberately loading dynamically?
Good choice on splitting the module system into user-land. I agree with both @creationix here that having one is good for development and with @dominictarr that they aren't needed for production. Is main.js as an entry-point going to be configurable? I'd like to have a separate dev.js and prod.js so I can do both.
This is going to be amazing for transpile-to-js languages, you essentially get statically linked small(ish) binaries for free if you just choose JS as your target.
@matthewp luvi has an option to override the entry point, but it's tricky designing the CLI without resorting to environment variables that can cause security vulnerabilities.
That said, you can have a main.js that loads a real main of your choice based on some env or argument.
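For example, a `main.js` could start with something like the sketch below. The `--entry=` flag, the `APP_ENV` variable, and the `pickEntry` name are invented for illustration; `dev.js` and `prod.js` are the file names from the question above.

```javascript
// Pick the real entry point from an argument or an environment variable.
// An explicit --entry= argument wins; otherwise fall back on APP_ENV.
function pickEntry(args, env) {
  const flag = args.find((a) => a.startsWith('--entry='));
  if (flag) return flag.slice('--entry='.length);
  if (env.APP_ENV === 'development') return 'dev.js';
  return 'prod.js';
}

const fromFlag = pickEntry(['--entry=tools/repl.js'], { APP_ENV: 'development' });
const fromEnv = pickEntry([], { APP_ENV: 'development' });
const fallback = pickEntry([], {});
```

Defaulting to `prod.js` means a stray environment variable can only switch a build into development mode if it is set deliberately.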
The basic goal of this project is to implement a tiny core runtime that contains libuv, javascript and some essential C libraries (like openssl) needed to re-implement node.js in userland as modules.
If possible this will be backend agnostic and allow multiple JS engines.
See also https://github.com/nodejs/node/issues/7098