dherman opened 8 years ago
/cc @KenanSulayman
This has a lot of overlap with needs in the greater Rust community. I'm just going to braindump some stuff real quick.
Rust also has a need to deploy binary versions of crates via Cargo to speed up builds. The most recent motivator for this is the creation of a "Rust Platform metapackage", which combines multiple commonly-used crates into a single bundle and makes them accessible seamlessly as though they were part of the standard library. To do this we would expect to be able to securely publish precompiled crates to crates.io. It doesn't look like this exact formulation will come to pass in any near-term timeframe, but there are other reasons for Cargo to want this; more generally, it would just be nice to accelerate builds by having binaries available for the most common configurations. There are a lot of concerns in a workable design, especially regarding security, and no particular movement on this presently, so it's not clear whether Neon could leverage anything upstream for binary deployment. But this is a good use case to consider, and a potential motivator for at least syncing up on a compatible direction. It may not make sense for Neon to do its own thing here from scratch, because it will need to solve approximately the same security problems as Cargo would, but maybe there are existing systems it could leverage (e.g. if node's publishing mechanisms already provide signature validation).
One significant problem with publishing binaries is dealing with all the potential hardware configurations users might have: describing that configuration, building for that configuration, and determining which configuration is compatible with any given user's build. These are all pressing issues for Rust too, and there are no complete solutions, but we probably need to align on them. In Rust we care about this because there are some compile-time options that affect the behavior of the entire crate DAG and which libraries are compatible to link to (allocator and unwinder implementation, CPU features); selecting some such features may be incompatible with the available standard library and force it to be recompiled.
Some links on this topic (that are probably going to be difficult to digest without context):

- `-C target-feature`

How such codegen options are resolved will influence Cargo's notion of a "target configuration", and would affect how Neon needs to think about architecture compatibility. For Neon this may not be a huge concern -- you may just settle on maximally-compatible, conservative configurations and punt on the complexities; but considering that Neon is trying to make JS go fast, ultimately it's probably going to want to e.g. turn on the most aggressive CPU features when it can.
For binary deployments Neon has concerns beyond Rust's, most obviously that Neon users want to consume these binaries without acquiring the Rust toolchain. In the Rust world this would probably mean supporting binary deployments of "staticlibs" or "cdylibs", publishing them to a location, and using a security scheme that is easy to manage without Cargo.
When looking for synergies between Cargo binary publication and Neon binary publication, one also has to consider that Cargo would only publish binaries for crates registered with crates.io. It's not obvious to me that Neon libraries would want to do that, in which case there is less overlap with a Cargo solution; or that constraint may influence a Cargo solution to be more flexible.
I can't speak to how binary publishing would interact with the node workflow.
The focus of your requirement here is dealing with node/V8 API/ABI compatibility, but as mentioned above, do consider that "architecture" can mean something like "base target triple + cpu features + global runtime configuration".
In order to achieve ABI compatibility, as you've mentioned, the obvious thing to do is to create your own compatibility layer. Since the clients of your published binaries are going to need to target this compat layer without the Rust toolchain, the obvious approach is to write that layer in C/C++, compile it locally, then link the Neon Rust binaries to it. This assumes that you can tolerate a local g++ dependency; if not, you could consider publishing those binaries too, for all possible versions of node.
I don't think there are Rust-specific considerations here, except that if you want to publish binaries of Neon's compat layer, that could influence the joint binary publication design.
The most robust solution will definitely have a fallback for when there is no compatible architecture among the published binaries, or when the user wants to customize code generation (turning on aggressive CPU features). Because source compilation is the easiest to implement, even though it is the least desirable, I'd suggest pursuing it first; that way you support fallback naturally by the time you do get binary deployment.
The way to do this is probably to use rustup. You might do some simple detection of whether rustup/rustc already exists on the system, and if not, install and configure it (asking permission to do so, etc.). You would ideally want to install it to a shared location so that all Neon-enabled projects share it; you might even just say "you're about to install Rust", and then install it normally. As far as installation goes, it should just be a matter of detecting the appropriate target triple, coming up with the right URL for rustup-init, and running it with the desired options.
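For illustration, a minimal sketch of that detection step, assuming the standard rustup-init URL scheme (everything below is a hypothetical helper, not part of Neon):

```ts
// Map Node's platform/arch names onto Rust target triples and build the
// rustup-init download URL (https://static.rust-lang.org/rustup/dist/<triple>/rustup-init).
import * as os from "os";

const archMap: Record<string, string> = { x64: "x86_64", arm64: "aarch64" };

function hostTriple(): string {
  const arch = archMap[os.arch()] ?? os.arch();
  switch (os.platform()) {
    case "linux":
      return `${arch}-unknown-linux-gnu`;
    case "darwin":
      return `${arch}-apple-darwin`;
    case "win32":
      return `${arch}-pc-windows-msvc`;
    default:
      throw new Error(`unsupported platform: ${os.platform()}`);
  }
}

const ext = os.platform() === "win32" ? ".exe" : "";
console.log(`https://static.rust-lang.org/rustup/dist/${hostTriple()}/rustup-init${ext}`);
```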
Another option: make Neon always publish asmjs/wasm versions of Neon libraries. The downsides here are that you need to have Emscripten installed, and more than that, the correct Emscripten; the way to automatically get the correct Emscripten to pair with Rust is not clear yet. Rust doesn't support this yet but is very close. The Rust story for publishing asmjs/wasm as libraries is also not done, there are some major bugs, etc., and asmjs/wasm can't access lots of useful I/O features, at least not yet.
node-pre-gyp might be helpful.
I feel like it might be better though, at least in the shorter term, to just make an npm module that installs Rust if it's not already present. You can use the `install` lifecycle event in npm to trigger a script which checks if Rust is present and installs it if necessary. For an example of how that lifecycle event works, see here and here.
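For concreteness, a sketch of such an install-time check (the file name, messages, and exact behavior are assumptions, not an existing package):

```ts
// install.ts, wired up in package.json as: "scripts": { "install": "node install.js" }
// Runs on `npm install` and checks for a Rust toolchain before building.
import { spawnSync } from "child_process";

const rustc = spawnSync("rustc", ["--version"], { encoding: "utf8" });
if (rustc.status !== 0) {
  console.error("Rust toolchain not found; see https://rustup.rs to install it.");
  process.exit(1); // or: download and run rustup-init here, after asking permission
}
console.log(`found ${rustc.stdout.trim()}; building the native module...`);
```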
Great conversation :-) This is definitely a problem for big shops. However, the approach of installing Rust from npm is a non-starter: we do not have access to the internet from the build machines. I wish npm had limited itself to installing actual node packages instead of trying to take over system-level issues.
For precompiling I suggest you take a look at https://github.com/mafintosh/prebuild
As for V8 API stability Node core is working on that here: https://github.com/nodejs/abi-stable-node
Now that https://github.com/rust-lang/rust/pull/36339 has landed, how hard would it be to build/deploy Neon-based libraries to wasm?
@brson?
As far as I know, wasm only allows passing byte arrays across the boundary. Neon passes object references in both directions, which is more powerful but also carries increased risk.
In my opinion, wasm is tangential to neon.
So is there a good solution these days?
Parcel makes it really easy to integrate Rust wasm with Node, but doing it manually isn't too bad either.
Better distribution for Neon is still an open question. We aren't currently distributing any Neon projects as libraries--only as part of an application. We use Docker multi-stage builds to eliminate the Rust build tooling from the final image.
@dherman @brson @Yoric @kjvalencik
I've come to conclude that the only sensible way forward is WebAssembly. Recent versions of Node support it natively, out of the box.
Specifically, as a real-world example, we have two different kinds of production systems, running on CoreOS and FreeBSD. When we tried shipping Neon modules to the FreeBSD systems we were forced to maintain two binary distributions, one for each platform.
Most functionality that depended on struct inter-op between Node and Neon can be easily replicated with `wasm_bindgen`:
```rust
use wasm_bindgen::prelude::*;

// Exported to JS as a class-like handle; the fields stay on the wasm side.
#[wasm_bindgen]
pub struct Profile {
    id: u64,
    tags: Vec<u64>,
}

// Returns a `Profile` by value; wasm-bindgen hands JS a reference to it.
#[wasm_bindgen]
pub fn foo() -> Profile {
    Profile { id: 1u64, tags: Vec::new() }
}
```
When `foo` is called, a reference to the struct inside the WebAssembly module is passed to Node-land. It can be freed using `.free()` from the JavaScript side.
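For instance, the consuming side might look like this (the module path and crate name are assumptions, based on wasm-pack's Node.js output layout):

```ts
// Load the wasm-bindgen generated bindings (e.g. built with
// `wasm-pack build --target nodejs` for a crate named "profile").
const { foo } = require("./pkg/profile");

const profile = foo(); // JS handle; the struct's data lives in wasm memory
// ... use the handle ...
profile.free();        // explicitly release the wasm-side allocation
```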
By leveraging the architecture-independent nature of WebAssembly we were able to get back to a single distribution. Performance dropped significantly relative to native (by 10% to 20%), but frankly, our latency-critical systems are fully written in Rust anyway.
I think Neon has a definitive use-case for environments where artefacts are distributed on-premise within self-contained deployments.
Going with WebAssembly, and providing a solid foundation and toolchain for future "near native" Node modules that can't be built with `wasm_bindgen` alone (or at least not elegantly), is the only way we can get both wide-spread use of Neon, because those wasm modules can be distributed via npm, and of Rust, because some developers will want to understand how these modules work and will learn Rust along the way.
FWIW, we ran into this problem with a Rust cryptography library that we built bindings for using Neon. We ended up with a pretty decent workflow where we compile everything via Neon and use `node-pre-gyp` both to create the tar file with the expected architecture/node version name and to auto-download the proper binary when consumers run `npm install`. `node-pre-gyp` allows a fallback to compile locally if it can't find the supported architecture, but we just ignore that since it's unlikely that the consumer has the Rust build toolchain locally. We hooked all of this up to TravisCI and have it build for different architectures and Node versions on merge, then publish the code to npm and push the Neon binary artifacts to GitHub releases when we push a tag to the repo. If interested, the repo is here.
While I agree that using WebAssembly within Node certainly makes all of this easier, the performance degradation compared to Neon (at least for us in a cryptography library) made it so it was definitely worthwhile to have a more complicated build/dependency system in place. I can only assume that the performance of WebAssembly in Node will continue to improve, but at least for our use case it seems like it's a long way off from being able to compete with Neon.
Almost 1.5 years later, I have stumbled upon this issue while trying to solve a deployment-related issue with native modules and wasm.

WASM is truly impressive; however, the native modules outperform it by a long way. I have been testing wasm and Neon with Node.js, specifically with the Rust code from a fib(46) test:

- Neon + Node 13: 4.6s
- Wasm + Node 13 (mjs): 13.5s
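The benchmark harness itself isn't shown in the thread; a minimal sketch of how such a comparison could be timed (the module paths and the `fib` export name are assumptions):

```ts
// Time one fib(46) call against the Neon build and the wasm build.
const native = require("./native/index.node"); // hypothetical Neon addon
const wasm = require("./pkg/fib_wasm");        // hypothetical wasm-bindgen output

const cases: Array<[string, (n: number) => number]> = [
  ["neon", native.fib],
  ["wasm", wasm.fib],
];

for (const [label, fib] of cases) {
  const start = process.hrtime.bigint();
  fib(46);
  const secs = Number(process.hrtime.bigint() - start) / 1e9;
  console.log(`${label} + Node: ${secs.toFixed(1)}s`);
}
```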
Though WASM loses in speed, it wins in deployment.
I don't think the performance of native modules can be debated in any way. You'll also lose any freedom outside of "logic only".
One question should be: if you're using Node, are the gains of native over WebAssembly significant enough to be the bottleneck for everyday use-cases? This includes ffi overhead, init & teardown, and so on.
The obvious solution to this problem is providing binaries for each architecture and using a prefix for the hosting -- like GitHub releases.
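In other words, something along these lines (the repo and naming scheme below are made up for illustration):

```ts
// Compute a per-architecture asset URL under a GitHub releases prefix.
const base = "https://github.com/example/my-module/releases/download";
const version = "v1.2.3";
const asset = `my-module-${process.platform}-${process.arch}.tar.gz`;

console.log(`${base}/${version}/${asset}`);
// e.g. .../v1.2.3/my-module-linux-x64.tar.gz
```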
Another year later and I need to publish the binding as an npm package. @ernieturner's example is truly impressive and works great. But for those who do not want a long `publish.js` script and want to get publishing done without much thought, I have a rather small example. All you need to touch is `package.json` and `release.yml`. Binaries are stored on the GitHub releases page instead of S3.

See the actual package.json, publish.yml, npm package, and the release page.
The whole process is:

1. Use `node-pre-gyp` to package and upload your binaries (index.node) to GitHub releases, so others can pull the right binary from there.
2. Run `npx tsc` and publish the npm package without binaries via `npm publish`.
3. Add `"install": "node-pre-gyp install"` to `scripts` in `package.json`, so users will trigger `node-pre-gyp` to pull the right binary after `npm i your-package`.

Add those fields to your `package.json` and fire `npm i`:
```json
{
  "main": "dist/index.js",
  "scripts": {
    "build": "cargo-cp-artifact -nc index.node -- cargo build --message-format=json-render-diagnostics",
    "build-debug": "npm run build --",
    "build-release": "npm run build -- --release",
    "release-native": "npm run build-release && rm -rf native && mkdir native && mv ./index.node ./native/index.node",
    "release-js": "npx tsc",
    "test": "npm run release-native && dev=true node --loader ts-node/esm --experimental-vm-modules node_modules/jest/bin/jest.js --runInBand"
  },
  "os": [
    "darwin",
    "linux",
    "win32"
  ],
  "cpu": [
    "x64"
  ],
  "dependencies": {
    "@mapbox/node-pre-gyp": "^1.0.8"
  },
  "binary": {
    "module_name": "pravega",
    "module_path": "./native",
    "host": "https://github.com/thekingofcity/pravega-client-rust/releases/download/",
    "package_name": "{module_name}-v{version}-{node_abi}-{platform}-{arch}-{libc}.tar.gz",
    "remote_path": "v{version}"
  }
}
```
For tests, simply run `npm run build-debug` and `npm run test`.

For a local install, simply run `npm run release-native`, `npm run release-js`, and `npm pack`. You can then check that the tarball has both the dist js files and the index.node addon. The actual `npm publish` should ignore the `native` folder so no binary is published.

For npm publish, see the following GitHub Actions workflow. Notes:

- `tsconfig.json` compiles ts files to the `dist` folder.
- The addon is built to `./native/index.node`, and `"module_path": "./native"` tells `node-pre-gyp` to package and upload this addon.
- Replace `"host"` with your own repo path.
- `"remote_path": "v{version}"` indicates a `v` will appear before the version; our tags use `v` as the prefix.

And in `publish.yml`, triggered by tags like `v0.4.0`:
```yaml
name: package and publish to npm
on:
  push:
    tags:
      - '*'
jobs:
  nodejs-npm:
    name: nodejs-npm
    runs-on: ubuntu-latest
    # Prevent a situation where the native build fails and an npm package is uploaded anyway.
    needs: [nodejs-github-native]
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.release.tag_name }}
      - name: Set release version
        # Set the release version on all three OSes; the commented run should suffice for Linux and Mac.
        run: python3 -c "import os; tag = os.environ['GITHUB_REF'].split('/')[-1]; f = open(os.environ['GITHUB_ENV'], 'a'); f.write('RELEASE_VERSION='+tag); f.close();"
        # run: echo "RELEASE_VERSION=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
      - uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node_version }}
          # The url is important! This makes NODE_AUTH_TOKEN accessible to npm publish.
          registry-url: 'https://registry.npmjs.org'
      - name: Install modules
        working-directory: ./nodejs
        run: npm i
      - name: Build js
        working-directory: ./nodejs
        run: npm run release-js
      - name: Tweak package.json
        working-directory: ./nodejs
        # This updates the package version to the tag version and adds an install
        # script to package.json, so users who `npm i` this package will trigger
        # node-pre-gyp to pull the os- and arch-specific binary.
        run: python3 -c "import os; import json; p = json.load(open('package.json')); p['scripts']['install'] = 'node-pre-gyp install'; p['version'] = os.environ['RELEASE_VERSION']; json.dump(p, open('package.json', 'w'), indent=2, ensure_ascii=False);"
      - name: Publish to npm
        working-directory: ./nodejs
        # `--access public` is used to publish to my account's scope.
        run: npm publish --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
  nodejs-github-native:
    name: nodejs-${{ matrix.node_version }}-${{ matrix.system.target }}-${{ matrix.system.os }}
    runs-on: ${{ matrix.system.os }}
    strategy:
      fail-fast: false
      matrix:
        node_version:
          - 12
          - 14
          - 16
        system:
          - os: macos-11
            target: x86_64-apple-darwin
          - os: ubuntu-20.04
            target: x86_64-unknown-linux-gnu
          - os: windows-2022
            target: x86_64-pc-windows-msvc
        # Would like to have aarch64 support, but actions does not provide these runners yet.
        # https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners
    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.release.tag_name }}
      - name: Set release version
        # Set the release version on all three OSes; the commented run should suffice for Linux and Mac.
        run: python3 -c "import os; tag = os.environ['GITHUB_REF'].split('/')[-1]; f = open(os.environ['GITHUB_ENV'], 'a'); f.write('RELEASE_VERSION='+tag); f.close();"
        # run: echo "RELEASE_VERSION=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
      - uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node_version }}
          registry-url: 'https://registry.npmjs.org'
      - name: Install modules
        working-directory: ./nodejs
        run: npm i
      - name: Tweak package.json
        working-directory: ./nodejs
        # This updates the package version to the tag version, so artifacts uploaded to the GitHub release are named correctly.
        run: python3 -c "import os; import json; p = json.load(open('package.json')); p['version'] = os.environ['RELEASE_VERSION']; json.dump(p, open('package.json', 'w'), indent=2, ensure_ascii=False);"
      - uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: stable
          target: ${{ matrix.system.target }}
          override: true
      - name: Build native
        working-directory: ./nodejs
        run: npm run release-native
      - name: Package the asset
        working-directory: ./nodejs
        # This makes a node-pre-gyp package.
        run: npx node-pre-gyp package
      - name: Upload to GitHub releases
        working-directory: ./nodejs
        # Use bash, even on Windows, to make `find` available.
        shell: bash
        # A release needs to be created before the upload.
        run: gh release upload ${{ env.RELEASE_VERSION }} "$(find ./build -name *.tar.gz)" --clobber
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
Also:

- Add `NPM_TOKEN` to your repository's secrets.
- Only `node-pre-gyp` is used and no `node-gyp`, so no `binding.gyp` is required. The build process is handled by `npm run release-native`.
.@thekingofcity Thank you so much for sharing this complete example!
Another technique is to publish platform-specific versions of your module, e.g. `my-module-linux64`, `my-module-windows64`, etc. These become `optionalDependencies`, and a post-install script installs the correct one.
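A minimal sketch of the selection step, assuming per-platform package names like `my-module-linux-x64` (the naming scheme and package names here are made up):

```ts
// index.ts of the meta-package: load whichever platform package got installed.
function platformPackage(): string {
  return `my-module-${process.platform}-${process.arch}`;
}

let binding: unknown;
try {
  // Only the optionalDependency matching this platform will be present.
  binding = require(platformPackage());
} catch {
  throw new Error(`no prebuilt binary for ${process.platform}-${process.arch}`);
}

export = binding;
```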
There's also an RFC to codify this pattern in npm itself: https://github.com/npm/rfcs/pull/519
Would love to see official support from npm itself. The example shows a pretty simple way of using and distributing binaries.

PS: Why does the example in the RFC still use a preinstall script, `"preinstall": "node-gyp rebuild"`? Isn't all the work done by npm?
No, I don't think `npm` will automatically detect `node-gyp`. AFAIK, it needs to be in a script.
Hi, new to this discussion... regarding "Option 1" (install-time compilation): the issue mentions long "npm install" times as the Rust toolchain is installed, but presumably that could be mitigated by:

- Doing whatever `node-gyp` does now, if the cost is acceptable? Doesn't that rely on having a C toolchain (granted, more lightweight than Rust, but...)?

For CI pipelines, couldn't this be mitigated by an appropriate Docker image for the CI runner that includes the Rust toolchain?
Most CI pipelines should have some kind of cache control. For GitHub Actions, you may find this action helpful :) It's also easy to find tutorials showing how to use them for Rust projects, e.g. How I speeded up my Rust builds on GitHub ~30 times.
Deploying an app with Neon is fine, but deploying a lib will be harder until we can make it relatively seamless for downstream consumers of the lib (including transitive dependents) to be completely oblivious to the dependency on Neon and Rust.
Broadly, there are two possible approaches: 1) automate the process of installing Rust so downstream consumers can silently build their Neon dependencies on the fly, or 2) streamline the process of shipping precompiled binaries of a Neon lib.
Option 1) is attractive in theory, since it means neither library authors nor library consumers have to do any work. But I'm not really sure how to make it work, and even if we do, it could generate excruciatingly long `npm install` wait times as Rust is downloaded and installed and then as the Neon dependencies are compiled.

Option 2) seems more achievable but still has a number of challenges. I'll spell out some constraints and ideas I can think of, but I'm really interested in getting ideas and feedback.
Constraint: Pushbutton precompilation.
It should be possible to generate cross-compiled binaries for multiple platforms with a single command -- maybe this can be achieved with prepublish hooks alone, or maybe we should hijack the workflow with a `neon publish` command. (This is going to depend on the state of Rust cross-compilation. Would be great to get @brson's or @alexcrichton's input here.)
Constraint: Deploy just one binary per architecture.
The Node and V8 ABIs are not stable, which means that a naive design would require library authors to deploy a combinatorial explosion of (# of supported Node versions) × (# of supported architectures) precompiled binaries. But if we can build an ABI-stable abstraction layer between Neon and Node, then it should be possible to precompile once per architecture in a way that should be stable across multiple Node versions. (This is a little tricky since Node changes all sorts of things, like the ABI of a module's metadata. But with a little courage and a lot of caffeine...)
It's obviously impossible to predict all future changes to Node and V8, but since Neon attempts mostly to hew to core Node and JS semantics, which are forced to retain compatibility with the world's JS content, it should be a relatively safe bet that we can create a stable API that surfaces those semantics; only the ways in which Node and V8 surface those APIs will change.
BTW, I don't think NAN really helps here, since it only attempts API compatibility but not ABI stability.
(I believe @carllerche has been through this kind of experience before -- would love his advice!)
Constraint: Make it possible to fall back to compilation.
It seems like it should still be good to allow downstream consumers to compile a Neon library from scratch, so that lib authors don't have to precompile for all supported architectures.
Constraint: Make it possible to customize fallback behavior.
There will be a couple of common cases of what people want to do when downstream consumers don't have a precompiled binary for their architecture: one is to provide a custom error message, another is to provide an alternative pure-JS implementation, and a third is to allow the application to branch to completely different behavior (for example, if the native library is an optional plugin).
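As a rough illustration of the fallback shape this implies (all names here are hypothetical):

```ts
// Hypothetical entry point of a Neon-backed package with customizable fallback.
let impl: any;
try {
  // Prefer the precompiled native addon when one matches this platform.
  impl = require("./native/index.node");
} catch {
  // No compatible binary: this branch can throw a custom error, load a
  // pure-JS implementation, or switch the application to other behavior.
  impl = require("./fallback-js");
}

export = impl;
```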
That's what I've got so far. Ideas? More constraints?