uhop / node-re2

node.js bindings for RE2: fast, safe alternative to backtracking regular expression engines.

Pre-building binaries for node-pre-gyp using Travis #66

Closed kobelb closed 4 years ago

kobelb commented 4 years ago

When trying to use node-re2 in Kibana, a number of developers ran into issues building the native modules. This PR switches from relying on node-gyp to node-pre-gyp so developers don't have to build the native modules.

Building portable Linux binaries is challenging given the diversity of distributions, the variety of available shared libraries, and the inability to statically link to glibc. The Linux builds are performed on CentOS Docker containers using the devtoolsets provided by Software Collections. This allows us to build using Travis CI while being fully in control of the operating system and build toolchain. The Node.js 10 build is done on CentOS 6 which creates a binary which is dynamically linked to glibc 2.12. Node.js 12 dropped support for glibc 2.12 and requires glibc 2.17. Therefore, the Node.js 12 and 14 builds are done on CentOS 7 which uses glibc 2.17.
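The build matrix described above can be sketched as data plus a compatibility check. This is only an illustration: `BUILD_TARGETS`, `compareVersions` and `canRunOn` are hypothetical names that are not part of the PR, and the check relies on the general rule that a binary linked against an older glibc runs on hosts with the same or a newer glibc.

```javascript
// Build targets as stated in the comment above (Node.js major -> build OS -> glibc).
// The structure and names are illustrative assumptions, not code from the PR.
const BUILD_TARGETS = [
  {nodeMajor: 10, os: 'CentOS 6', glibc: '2.12'},
  {nodeMajor: 12, os: 'CentOS 7', glibc: '2.17'},
  {nodeMajor: 14, os: 'CentOS 7', glibc: '2.17'}
];

// Compare dotted version strings numerically, so that '2.9' sorts before '2.12'.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Assumption: a binary built against glibc X is usable on hosts with glibc >= X.
function canRunOn(target, hostGlibc) {
  return compareVersions(hostGlibc, target.glibc) >= 0;
}
```

For example, `canRunOn(BUILD_TARGETS[1], '2.12')` is false, which is why the Node.js 12 build cannot target a CentOS 6 host.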

The Windows and Darwin builds do not use Docker; they are built on the operating systems that Travis provides. I haven't run into any portability issues with the Windows and Darwin binaries.

This PR relies on cloud resources in AWS, GCP and Docker. There will be some changes required before it can be merged, which I'm more than happy to help out with. At a minimum, Docker will have to be used for the aforementioned reasons, and the containers should be moved so they are no longer under my personal account at https://hub.docker.com/u/kobelb. Travis CI supports deploying the native module archives to a number of providers. Currently, these archives are deployed to AWS and then I manually copy them to GCP, where they're consumed, but this can be changed to whatever cloud provider you'd prefer. These archives can even be deployed as a GitHub release, if you'd prefer not to host them on AWS or GCP.

uhop commented 4 years ago

Closing:

  1. What problem does it solve? The tests would be nice.
  2. To my best knowledge, Kibana doesn't use node-re2.
  3. I don't want to include a mechanism that downloads executables onto the computers of unsuspecting users, and to be responsible for that.
  4. The whole machinery looks heavy-weight size-wise. It looks like it involves some third-party services, does not feel secure, and taxes me as a maintainer to support it too.

Obviously I could be wrong, but I feel that distribution channels are a different beast and they should be tackled differently and independently from writing code. It is an orthogonal business, and it should be done separately from the main functionality. A wrapper maybe?

My personal position is to be compatible with node-gyp. If it works on your computer it should be able to build node-re2. I understand the burden to have a build environment set up, but it is the official tool, which admittedly can be improved by various techniques including the ones you are proposing.

If you have legitimate concerns about building and distributing native code modules, I suggest bringing it up with Microsoft (NPM) and Node developers. I am sure they want to improve the overall developer experience more than anybody and will be willing to entertain a generic solution.

kobelb commented 4 years ago

What problem does it solve? The tests would be nice.

It solves the problem of consumers of node-re2 being required to build the native modules themselves as opposed to using pre-built binaries. It's a way to solve https://github.com/uhop/node-re2/issues/18 which was originally opened because users have issues running node-gyp. Additional manual steps are required to install node modules which rely on node-gyp per https://github.com/nodejs/node-gyp#installation, and this removes the necessity to do so.
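The general install-time idea (use a pre-built binary when one is available, otherwise compile locally) can be sketched as follows. This is a minimal illustration, not node-pre-gyp's actual implementation: `installNativeModule`, `fetchPrebuilt` and `buildFromSource` are hypothetical names, and the two callbacks stand in for the real download and node-gyp build steps.

```javascript
// Hypothetical install flow: prefer a pre-built binary, fall back to a local build.
// Both callbacks are placeholders supplied by the caller and return a file path.
function installNativeModule(fetchPrebuilt, buildFromSource) {
  try {
    const file = fetchPrebuilt();
    if (file) {
      return {source: 'prebuilt', file};
    }
  } catch (error) {
    // Download failed (no matching binary, network error, ...): build instead.
  }
  return {source: 'built', file: buildFromSource()};
}
```

With this shape, consumers who have a matching binary never need a compiler toolchain, while everyone else keeps the existing node-gyp behavior.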

If additional testing would make you feel more comfortable, I'm willing to add whatever test coverage you think is necessary. For what it's worth, Travis is running the existing tests against the built version of the binary: https://github.com/uhop/node-re2/pull/66/files#diff-354f30a63fb0907d4ad57269548329e3R32

To my best knowledge, Kibana doesn't use node-re2.

You're correct that it doesn't at the moment. It did for about a day before enough developers ran into issues with the reliance on node-gyp that we reverted the change and I began investigating using node-pre-gyp.

I don't want to include a mechanism that downloads executables onto the computers of unsuspecting users, and to be responsible for that.

node-re2 is a Node.js package which runs arbitrary code both at install time and at runtime. I get that native binaries are more opaque than JavaScript, but the potential impact is the same. Additionally, there are quite a few npm packages which rely on node-pre-gyp: https://www.npmjs.com/browse/depended/node-pre-gyp. The fact that other packages rely on this approach doesn't necessarily mean it's a good solution. It does have precedent, though.

The whole machinery looks heavy-weight size-wise. It looks like it involves some third-party services, does not feel secure, and taxes me as a maintainer to support it too.

I attempted to keep the external dependencies to a minimum, and there are improvements that can be made. However, I do agree that it increases the maintenance burden.

uhop commented 4 years ago

I am willing to entertain the idea of a separate wrapper package.

uhop commented 4 years ago

I am looking into that and have problems building the CentOS 6 image. I found this explanation in https://forums.centos.org/viewtopic.php?t=71663

devtoolset-6 is deprecated. Maybe you should try -7 or -8.

And more:

SCL packages do not and never have had the same lifespan as normal packages from the distro itself. The distro and its packages are supported for 10 years from initial release but SCLs are supported for 3 years, sometimes less. We don't "just deprecate things that people use", we deprecate things that are out of support and potentially dangerous (due to lack of maintenance).

I'll try to look for alternatives.

uhop commented 4 years ago

I think I solved the problems. See 1.14.0.

All binaries are kept as release assets: https://github.com/uhop/node-re2/releases/tag/1.14.0
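For illustration, an installer could locate a matching release asset roughly like this. The download URL follows GitHub's standard `releases/download/<tag>/<asset>` pattern, but the asset naming scheme below (`platform-arch-abi.node`) and the function name `prebuiltAssetUrl` are assumptions made for this sketch, not necessarily what 1.14.0 uses.

```javascript
// Sketch: derive a release-asset URL for the current platform.
// ASSUMPTION: the `${platform}-${arch}-${abi}.node` asset name is hypothetical.
function prebuiltAssetUrl(
  tag,
  platform = process.platform, // e.g. 'linux', 'darwin', 'win32'
  arch = process.arch, // e.g. 'x64'
  abi = process.versions.modules // Node.js native module ABI version
) {
  const asset = `${platform}-${arch}-${abi}.node`;
  return `https://github.com/uhop/node-re2/releases/download/${tag}/${asset}`;
}
```

Keying the asset on the ABI version rather than the Node.js version means one binary can serve every Node.js release that shares that ABI.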

Please verify that it works for you.

First of all: thank you for the idea and for the research you did on that. It sped up my development of that feature considerably.

Backgrounder

I have wanted to switch from Travis CI to GitHub Actions for some time now. I finally did it this long weekend. It solved several problems for me as a maintainer:

kobelb commented 4 years ago

Hey @uhop, thanks so much for all of your help with this! Based on my preliminary testing, your changes are working perfectly! I'm going to start on reintegrating this to Kibana, and I'll let you know if we find anything along the way.

uhop commented 4 years ago

Excellent. I will extract the builder/installer part into separate project(s), so people can plug-and-play the solution. Building binary extensions for Node is a bitch and I hope it'll make the whole process easier.