WiseLibs / better-sqlite3

The fastest and simplest library for SQLite3 in Node.js.
MIT License
5.43k stars 394 forks source link

Replace prebuild+prebuild-install with prebuildify #655

Open fabiospampinato opened 3 years ago

fabiospampinato commented 3 years ago

Context: I'm trying to prune my dependency tree as much as possible out of security concerns.

This is the dependency tree NPM is giving me for better-sqlite3:

├─┬ better-sqlite3@7.3.1
  ├─┬ bindings@1.5.0
    └── file-uri-to-path@1.0.0
  ├─┬ prebuild-install@6.1.2
    ├── detect-libc@1.0.3
    ├── expand-template@2.0.3
    ├── github-from-package@0.0.0
    ├── minimist@1.2.5 deduped
    ├── mkdirp-classic@0.5.3
    ├── napi-build-utils@1.0.2
    ├─┬ node-abi@2.26.0
      └── semver@5.7.1
    ├── noop-logger@0.1.1
    ├─┬ npmlog@4.1.2
      ├─┬ are-we-there-yet@1.1.5
        ├── delegates@1.0.0
        └─┬ readable-stream@2.3.7
          ├── core-util-is@1.0.2 deduped
          ├── inherits@2.0.3 deduped
          ├── isarray@1.0.0
          ├── process-nextick-args@2.0.1 deduped
          ├── safe-buffer@5.1.2 deduped
          ├─┬ string_decoder@1.1.1
            └── safe-buffer@5.1.2 deduped
          └── util-deprecate@1.0.2 deduped
      ├── console-control-strings@1.1.0
      ├─┬ gauge@2.7.4
        ├── aproba@1.2.0
        ├── console-control-strings@1.1.0 deduped
        ├── has-unicode@2.0.1
        ├── object-assign@4.1.1 deduped
        ├── signal-exit@3.0.3 deduped
        ├─┬ string-width@1.0.2
          ├── code-point-at@1.1.0
          ├─┬ is-fullwidth-code-point@1.0.0
            └── number-is-nan@1.0.1
          └── strip-ansi@3.0.1 deduped
        ├─┬ strip-ansi@3.0.1
          └── ansi-regex@2.1.1
        └─┬ wide-align@1.1.3
          └─┬ string-width@2.1.1
            ├── is-fullwidth-code-point@2.0.0
            └─┬ strip-ansi@4.0.0
              └── ansi-regex@3.0.0
      └── set-blocking@2.0.0
    ├─┬ pump@3.0.0
      ├─┬ end-of-stream@1.4.4
        └── once@1.4.0 deduped
      └── once@1.4.0 deduped
    ├─┬ rc@1.2.8
      ├── deep-extend@0.6.0
      ├── ini@1.3.7 deduped
      ├── minimist@1.2.5 deduped
      └── strip-json-comments@2.0.1
    ├─┬ simple-get@3.1.0
      ├─┬ decompress-response@4.2.1
        └── mimic-response@2.1.0
      ├── once@1.4.0 deduped
      └── simple-concat@1.0.1
    ├─┬ tar-fs@2.1.1
      ├── chownr@1.1.4
      ├── mkdirp-classic@0.5.3 deduped
      ├── pump@3.0.0 deduped
      └─┬ tar-stream@2.2.0
        ├─┬ bl@4.1.0
          ├── buffer@5.6.0 deduped
          ├── inherits@2.0.4
          └─┬ readable-stream@3.6.0
            ├── inherits@2.0.4 deduped
            ├─┬ string_decoder@1.3.0
              └── safe-buffer@5.2.1
            └── util-deprecate@1.0.2 deduped
        ├── end-of-stream@1.4.4 deduped
        ├── fs-constants@1.0.0
        ├── inherits@2.0.3 deduped
        └─┬ readable-stream@3.6.0
          ├── inherits@2.0.3 deduped
          ├─┬ string_decoder@1.3.0
            └── safe-buffer@5.2.1
          └── util-deprecate@1.0.2 deduped
    └─┬ tunnel-agent@0.6.0
      └── safe-buffer@5.1.2 deduped
  └─┬ tar@6.1.0
    ├── chownr@2.0.0
    ├─┬ fs-minipass@2.1.0
      └── minipass@3.1.3 deduped
    ├─┬ minipass@3.1.3
      └── yallist@4.0.0
    ├─┬ minizlib@2.1.2
      ├── minipass@3.1.3 deduped
      └── yallist@4.0.0
    ├── mkdirp@1.0.4 deduped
    └── yallist@4.0.0

Interestingly this is the same tree without "prebuild-install":

├─┬ better-sqlite3@7.3.1
  ├─┬ bindings@1.5.0
    └── file-uri-to-path@1.0.0
  └─┬ tar@6.1.0
    ├── chownr@2.0.0
    ├─┬ fs-minipass@2.1.0
      └── minipass@3.1.3 deduped
    ├─┬ minipass@3.1.3
      └── yallist@4.0.0
    ├─┬ minizlib@2.1.2
      ├── minipass@3.1.3 deduped
      └── yallist@4.0.0
    ├── mkdirp@1.0.4 deduped
    └── yallist@4.0.0

Basically prebuild-install is installing way too much stuff and I'd like to see it removed as a dependency.

On prebuild-install's own readme it says the following:

Instead of prebuild paired with prebuild-install, we recommend prebuildify paired with node-gyp-build.

And in prebuildify's readme it says:

With prebuildify, all prebuilt binaries are shipped inside the package that is published to npm, which means there's no need for a separate download step like you find in prebuild.

Essentially it seems that a whole lot of code and complexity could be removed just by shipping the prebuild binaries with the package itself rather than requiring an extra download step for them.

For completeness this is the dependency tree I'm getting for prebuildify:

└─ prebuildify@4.1.2
   ├─ execspawn@1.0.1
   │  └─ util-extend@1.0.3
   ├─ pump@3.0.0
   │  ├─ end-of-stream@1.4.4
   │  │  └─ once@1.4.0
   │  └─ once@1.4.0
   │     └─ wrappy@1.0.2
   ├─ npm-run-path@3.1.0
   │  └─ path-key@3.1.1
   ├─ node-abi@2.30.0
   │  └─ semver@5.7.1
   ├─ mkdirp-classic@0.5.3
   ├─ minimist@1.2.5
   ├─ tar-fs@2.1.1
      ├─ chownr@1.1.4
      ├─ mkdirp-classic@0.5.3
      ├─ tar-stream@2.2.0
      │  ├─ bl@4.1.0
      │  │  ├─ inherits@2.0.4
      │  │  ├─ readable-stream@3.6.0
      │  │  └─ buffer@5.7.1
      │  │     ├─ ieee754@1.2.1
      │  │     └─ base64-js@1.5.1
      │  ├─ end-of-stream@1.4.4
      │  ├─ fs-constants@1.0.0
      │  ├─ inherits@2.0.4
      │  └─ readable-stream@3.6.0
      │     ├─ inherits@2.0.4
      │     ├─ util-deprecate@1.0.2
      │     └─ string_decoder@1.3.0
      │        └─ safe-buffer@5.2.1
      └─ pump@3.0.0

It looks like maybe the tar-fs package could be used instead of the "tar" one to trim down this further.

Potentially I'd be even happier just rebuilding sqlite locally even if that takes a bit longer to install, or with some other solution that manages to keep using prebuild binaries and trims down dependencies further.

I think this package is pretty popular on servers, where each extra dependency you need to install is a bit of a gamble, IMO not strictly necessary dependencies should be removed as much as possible.

fabiospampinato commented 3 years ago

@JoshuaWise I can look into submitting a PR for this if you like.

kstewart83 commented 3 years ago

I'm interested in developing sqlite solutions where my dev environment is windows and my target environment is Linux (AWS Lambda). For AWS lambda, you basically deploy by zipping up the folder including the code and node_modules folder. There are solutions with layers and other approaches, but it creates complexity with other parts of my build pipeline that I'd rather avoid if I can. I can follow the steps listed here for the sqlite3 package and it seems to work (not the starred solution, but the one that uses node-pre-gyp install). I tried following the same steps using better-sqlite but get the error:

Error: better-sqlite3 package.json is not node-pre-gyp ready:
node-pre-gyp ERR! stack package.json must declare these properties:
node-pre-gyp ERR! stack binary.module_name
node-pre-gyp ERR! stack binary.module_path
node-pre-gyp ERR! stack binary.host

Not being to familiar with node-gyp, would the solution above help support this use case of installing prebuilt binaries other than the current host environment into node_modules?

Prinzhorn commented 3 years ago

@fabiospampinato I'm curious if you have any experience with Electron and prebuildify? I've searched the electron-forge, electron-builder and electron-packager repos for "prebuildify" and didn't find satisfying results. My main question is: how do you not ship all those binaries with your Electron app? E.g. you want to exclude the Windows and macOS binaries from a Linux Electron and vice versa? The prebuildify docs say you use node-gyp-build to load the correct binary and run-time, but how to you only bundle the needed binding at package-time? This can easily account for 20MB of trash otherwise.

fabiospampinato commented 3 years ago

@Prinzhorn I mainly work on an Electron app actually, but I don't know too much about prebuildify, I only tried to use that once, in this fork of nsfw and I didn't quite enjoy the experience.

Regarding your question I think if you want to have a smaller installation footprint you'll have to manually exclude the artifacts for other platforms when bundling the app for each platform, you can do so with electron-builder, I'm not familiar with forge and packager but I'd imagine they have a similar system.

In general I'm not too familiar with using native dependencies in Node, I've found them to be a pain in the ass to work with. Maybe better options could be to use N-API o WASI, but I've no idea how those work.

Surely having to download a critical dependency at runtime for your app to work comes with its own downsides though.

JoshuaWise commented 2 years ago

The README for prebuildify states:

it is faster to download all prebuilt binaries for every platform when they are bundled than it is to download a single prebuilt binary as an install script.

However, I find this hard to believe when better-sqlite3 has 44 prebuilt binaries (and more are added every few months), each one being about 1MB. Users with slow broadband connections will be punished.

fabiospampinato commented 2 years ago

it is faster to download all prebuilt binaries for every platform when they are bundled than it is to download a single prebuilt binary as an install script.

I don't know what kind of logic they used to come up with that argument.

IMO this proposal isn't an improvement regarding speed of installation but regarding security, every dependency increases the likelihood that at least one of your dependencies will eventually ship malware, and that can be game over. 40+ binaries is indeed a lot, but IMO switching to the safer approach is still totally worth it for the security aspect alone.


Still it'd be nice if making the installation more secure didn't require a 40x slowdown.

Regarding the number of artifacts:

better-sqlite3-node-darwin-x64.tar.gz
better-sqlite3-node-linux-x64.tar.gz
better-sqlite3-node-linuxmusl-arm64.tar.gz
better-sqlite3-node-linuxmusl-x64.tar.gz
better-sqlite3-node-win32-x64.tar.gz
better-sqlite3-node-win32-ia32.tar.gz

Which would make downloading all of them a non-problem basically.


Optimizing the generation of artifacts can be a pain in the ass though, and maybe that's not doable for some reason, IMO there's a much simpler solution to the problem: let's just publish the *.node files, bypassing tar and gzip entirely, then I can write a drop-in replacement for prebuild-install that downloads the right file without importing a million dependencies, and then I can submit a PR for using that instead. The download size in the end should be the same because those files should be compressed at the network layer, if they aren't just an extra MB will be downloaded anyway.

Obviously non having any dependency that downloads anything at installation-time would be better from a security perspective, but if it's just a tiny dependency, and especially if I'm maintaining it, it's totally ok from my point of view.

Does that sound reasonable?

JoshuaWise commented 2 years ago

Obviously non having any dependency that downloads anything at installation-time would be better from a security perspective, but if it's just a tiny dependency, and especially if I'm maintaining it, it's totally ok from my point of view.

@fabiospampinato that does sound reasonable. Although, if we're bypassing tar/gzip, the .github/workflows/build.yml file will need to be changed to accomplish that. If you're able to figure that out as well, I would highly appreciate it.

maxmilton commented 2 years ago

An alternative approach to prebuilt binary distribution that may be worth looking at is how esbuild works. I believe what happens is esbuild leverages the package manager (npm, yarn, pnpm, etc.) optionalDependencies resolution to install a platform-specific package that contains the actual bin. A postinstall script in the main package then links up the platform-specific bin to the main package API. There are a bit more specific details but you get the general idea.

Not sure how well that kind of approach might work with this project but it's interesting for sure.


fabiospampinato commented 2 years ago

@JoshuaWise I've taken a look at the workflow but that's a can of worm I'd rather not open, I have 0 experience with prebuild and github workflows in general, it'd be great if you could take care of that. In any case I'd say that's a problem that can be tackled after the thing that downloads the .node file is written. I'll try to implement that sometime next month work permitting.

alex22cc commented 2 years ago

Hi there, I came to this post because I liked the approach of better-sqlite3 but I failed to install it on my HostEurope Plesk server because they do not allow the execution of

The environment there is just to install fixed packages and but not pre-gyp packages. Therefore I would like to encourage you to have a much lighter dependency tree as @fabiospampinato has described. I think this way my problem is solved, too. This would also increase the number of developers using better-sqlite3 which is really worth to be used more frequently. Alex

fabiospampinato commented 2 years ago

Little update: I'm not longer interested in this, as I think there's just a much simpler approach to the whole problem: just ship the 3 official prebuild binaries for mac/linux/windows (https://www.sqlite.org/download.html) and spawn those directly. It's like 2MB gzipped total with no need to tar/ungzip/prebuild/download anything.

maxmilton commented 2 years ago

One downside with that is only the Windows prebuilt binaries come with a 32 bit and 64 bit version, Linux and macOS are x86 only. Extra system packages may need to be installed for 32 bit compatibility on some systems.

fabiospampinato commented 2 years ago

I think that's somewhat of a problem but at the other end of the spectrum, like macOS can't even run 32bits executables anymore, but there are ARM versions for all those OSes.

0gust1 commented 1 year ago

We face the same kind of issue (we can build on various systems, but deploy on linux).

For our main DB we use prisma (because of the ORM and the typing), but for additional DBs we want to use better-sqlite3 (the heaviness of an ORM is not justified for our "peripheral DB" use case).

Prisma builds binaries for all platforms, and it can be choosed at build time, regardless of the current OS architecture, by setting the right ENV var before installing

e.g., in our "production build script", we have PRISMA_CLI_BINARY_TARGETS=debian-openssl-1.1.x npm ci, and it makes prisma to integrate the linux binary in the dependencies (even when building from another OS).

I found this quite handy and elegant. It looks like it may eventually help people building electron app too, and I wonder if it could be a valid inspiration.