uhop / node-re2

node.js bindings for RE2: fast, safe alternative to backtracking regular expression engines.

Ship prebuilt binaries using optionalDependencies instead of install script #207

Closed panga closed 2 months ago

panga commented 3 months ago

Some strict CI environments don't allow post-install scripts to run, so the RE2 package never downloads or builds the native addon. Other restrictions include limited internet access, where external requests to anything other than an internal NPM proxy are not allowed.

This is done as a Node.js recommended security practice to prevent supply-chain attacks.

The solution is to use optionalDependencies and distribute the binaries directly from NPM, with no scripts or external requests required.

There are some Rust bindings projects that use this method successfully: https://github.com/Brooooooklyn/snappy https://github.com/napi-rs/node-rs

Other references: https://sentry.engineering/blog/publishing-binaries-on-npm https://github.com/evanw/esbuild/pull/1621

uhop commented 2 months ago

One problem with this approach is that it doesn't distinguish between musl-based and glibc-based Linux distributions. Unfortunately, we have to make this distinction. See for example the latest release: 1.20.10. Note especially linux vs. linux-musl.

TL;DR: Node produces official Docker images based on either Alpine Linux or Debian Linux. The former is a minimal image often used as a base for Node-based servers. The latter is a full Linux distribution, mostly used for development.

Alpine Linux uses musl. Debian uses glibc. They are incompatible. An extension compiled against one will not load if Node was compiled against the other. For example, I use Ubuntu (based on Debian) as my work machine. This means that my system uses glibc. But at work we have servers based on Alpine Linux, which uses musl. I don't see how we can avoid having two different versions of Linux.
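For illustration, one way a loader could tell the two libc families apart at runtime is to look at Node's diagnostic report. This is a minimal sketch, not something node-re2 is known to do: the function name `detectLibc` is hypothetical, and it assumes that the report header carries a `glibcVersionRuntime` field only when Node was built against glibc.

```javascript
// Sketch: guess the libc family of the running Node.js process.
// Assumption: on glibc builds the diagnostic report header includes
// `glibcVersionRuntime`; on musl builds the field is absent.
// `detectLibc` is a hypothetical helper name, not part of node-re2.
function detectLibc(platform = process.platform, report = process.report) {
  // The glibc/musl split only matters on Linux in this discussion.
  if (platform !== 'linux') return null;
  // Very old Node.js versions have no usable process.report.
  if (!report || typeof report.getReport !== 'function') return 'unknown';
  const header = report.getReport().header || {};
  return header.glibcVersionRuntime ? 'glibc' : 'musl';
}
```

Calling `detectLibc()` with no arguments inspects the current process. The parameters exist only so the logic can be exercised with a fake report object.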

BTW, disabling post-install breaks the official way of distributing binary plugins — they are all compiled during post-install. The same goes for node-re2: if an "official" precompiled binary isn't found, it will be compiled.

For security reasons, node-re2 supports private mirrors for precompiled artefacts. See the wiki for more details.

panga commented 2 months ago

@uhop the referenced packages include musl builds, so musl can be supported along with glibc. Please ignore the Sentry blog post; it is incomplete and just a starting point. Since you're building a Rust binary, you can use the napi-rs framework. With a few lines in package.json you can get it covered.

Example of snappy: https://github.com/Brooooooklyn/snappy/blob/main/package.json#L33

Matrix of builds:

|                  | node12 | node14 | node16 | node18 |
| ---------------- | ------ | ------ | ------ | ------ |
| Windows x64      | ✓      | ✓      | ✓      | ✓      |
| Windows x32      | ✓      | ✓      | ✓      | ✓      |
| Windows arm64    | ✓      | ✓      | ✓      | ✓      |
| macOS x64        | ✓      | ✓      | ✓      | ✓      |
| macOS arm64      | ✓      | ✓      | ✓      | ✓      |
| Linux x64 gnu    | ✓      | ✓      | ✓      | ✓      |
| Linux x64 musl   | ✓      | ✓      | ✓      | ✓      |
| Linux arm gnu    | ✓      | ✓      | ✓      | ✓      |
| Linux arm64 gnu  | ✓      | ✓      | ✓      | ✓      |
| Linux arm64 musl | ✓      | ✓      | ✓      | ✓      |
| Android arm64    | ✓      | ✓      | ✓      | ✓      |
| Android armv7    | ✓      | ✓      | ✓      | ✓      |
| FreeBSD x64      | ✓      | ✓      | ✓      | ✓      |

> disabling post-install breaks the official way of distributing binary plugins — they are all compiled during post-install

It is not the only official way, though it used to be. Over the years the NPM ecosystem has evolved to provide pre-built binaries and support for multiple platforms and libc implementations. Building can still be used as a fallback when no pre-built binary exists for the target platform.
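To make the "pre-built first, build as fallback" idea concrete, here is a rough sketch of what a loader could look like. Everything in it is an assumption for illustration: the `<base>-<platform>-<arch>[-<libc>]` naming scheme, the function names, and the idea that the locally compiled addon is simply the last candidate in the list. It is not how snappy, napi-rs, or node-re2 are actually implemented.

```javascript
// Hypothetical naming scheme: <base>-<platform>-<arch>[-<libc>].
function prebuiltPackageName(base, platform, arch, libc) {
  const parts = [base, platform, arch];
  // Only Linux needs the libc suffix (e.g. "gnu" or "musl") in this sketch.
  if (platform === 'linux' && libc) parts.push(libc);
  return parts.join('-');
}

// Try each candidate in order: prebuilt package first, local build last.
// `load` defaults to require; it is injectable so the logic can be tested.
function loadBinding(candidates, load = require) {
  const errors = [];
  for (const id of candidates) {
    try {
      return load(id);
    } catch (err) {
      // Remember why this candidate failed and move on to the next one.
      errors.push(`${id}: ${err.message}`);
    }
  }
  throw new Error('No native binding could be loaded:\n' + errors.join('\n'));
}
```

A caller would pass something like `[prebuiltPackageName('re2', process.platform, process.arch, libc), './build/Release/re2.node']`, where both the package name and the path are placeholders.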

> For security reasons, node-re2 supports private mirrors for precompiled artefacts.

This alternative, downloading from a cache instead of building, still relies on scripts, so it is not really approved by my company's security standards. In addition, users are not willing to maintain such a snowflake setup; they expect an out-of-the-box solution that works.

I understand this is an open source project and needs support for feature requests like this. If you're open to contributions, please let me know.

uhop commented 2 months ago

Thank you for the write up. Let's get down to brass tacks.

> Since you're building a Rust binary, it can use napi-rs framework.

I am not building a Rust binary.

> It is not the only official way, it used to be.

Could you point me to the relevant documents? Let me start you up:

Which one talks about it?

Just to be complete, this is the documentation for optionalDependencies:

I do not understand how it chooses the right binary. Does it try to install all available optional packages and the "wrong" ones somehow fail? How does it know it is the right one?

I can specify OS and CPU in package.json:

If you followed the links, you'll see that there is no way to encode musl vs. glibc difference.
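The limitation can be illustrated with a simplified model of `os`/`cpu` filtering. This is a sketch under stated assumptions, not npm's actual code: the matching rules are reduced to allow-lists plus `!`-prefixed block entries, and the two manifests and their package names are hypothetical.

```javascript
// Simplified model of an os/cpu field check (an assumption, not npm's code):
// entries prefixed with "!" block a value, other entries form an allow-list.
function fieldAllows(list, value) {
  if (!list || list.length === 0) return true; // no restriction declared
  const blocked = list.filter((e) => e.startsWith('!')).map((e) => e.slice(1));
  const allowed = list.filter((e) => !e.startsWith('!'));
  if (blocked.includes(value)) return false;
  return allowed.length === 0 || allowed.includes(value);
}

function manifestMatches(manifest, platform = process.platform, arch = process.arch) {
  return fieldAllows(manifest.os, platform) && fieldAllows(manifest.cpu, arch);
}

// Two hypothetical platform packages that differ only in libc.
const gnuManifest = { name: 're2-linux-x64-gnu', os: ['linux'], cpu: ['x64'] };
const muslManifest = { name: 're2-linux-x64-musl', os: ['linux'], cpu: ['x64'] };
```

Under this model both manifests match a Linux x64 machine, so `os` and `cpu` alone cannot pick between them. Any glibc vs. musl choice would have to happen somewhere else, for example at load time.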

Did I miss anything? I am sure I did. I am looking forward to learning more on this topic.