evanw / esbuild

An extremely fast bundler for the web
https://esbuild.github.io/
MIT License
37.95k stars 1.14k forks source link

esbuild seems to be broken when it comes to optional modules. #3173

Closed rubythulhu closed 1 year ago

rubythulhu commented 1 year ago

See mscdex/ssh2#1310 for more info.

I am not sure what package causes this error, I am posting this on multiple projects that are involved here. It could be ssh2, it could be esbuild, and it could be sst.

We have some AWS lambdas which are deployed via sst and built by esbuild. These lambdas have ssh2-sftp-client as a dependency, which in turn has ssh2 as a dependency. According to mscdex/ssh2#1310, this module SHOULD be optional, and indeed, the ssh2 package basically does try { require(binary_module) } catch {}.

esbuild seems to fail in this case, because the require() failed, even thought hat require() is caught by the catch{} block. We end up with a Lambda that fails RELIABLY when executed because this binary module could not be loaded, and this "optional" module does not exist in the build bundle, and the esbuild-compiled version of this function pukes on its own shoes because it cannot find this module (which is quite clearly optional).

hyrious commented 1 year ago

According to the stack trace your posted here, the actual error message is:

Error: Cannot find module 'ssh2'

..which surely comes from ssh2-sftp-client where it has code require('ssh2'). So this issue is about why it cannot find the ssh2 package at runtime (did you externalized it?) instead of whether or not the .node files are bundled/copied to the output.

rubythulhu commented 1 year ago

The ssh2 package is there, and the javascript part of its source is included in the source bundle. That package does a try { require(binary) } catch { /* do nothing */ } on the underlying binary module. This call should NEVER fail, because it swallows EVERY error.

Yet, at run-time? this fails. We did an upgrade of our software, which caused the binary modules to not be included in software (due to the above-described bug), and the lack of this binary module starts failing at runtime.

I understand that the error message says "can't load ssh2". But I can look at this package contents when it's failing, and a compiled-via-esbuild version of ssh2 exists in the bundle. Meanwhile, if i include the (confirmed via source to be optional) c++ dependency? function works. If my build process leaves it out? it fails reliably throwing an error message that it cannot load a binary module which doesn't exist.

Yes, we have tried external, that was one of the first things we tried.

There are two situations we can reliably get to happen:

I would absolutely appreciate suggested solutions, because everything i've tried is either "doesn't work", or "works, technically, but forces a redeploy every time". Nothing I have tried results in an answer that doesn't fit into the above two buckets.

hyrious commented 1 year ago

Thank your for providing more detailed cases. So basically you mean when the .node file is placed at the right location, then code magically runs well without throwing that module not found error. Without that file, the try-catch block seems not working and throws an error of not finding the whole package.

In that case, I guess this can be a weird behavior of the runtime -- AWS Lambda. I learned before where AWS did hack something in their runtime.

As a workaround, here's another choices to do The only thing that seems to "work" but could be saner:

rubythulhu commented 1 year ago

I'd rather not have the binary module in the first place, as they cause non-deterministic builds, meaning that every lambda that includes a binary module needs to be re-deployed every single time sst deploy, sst dev, or sst diff runs, because the shasum of the build changes.

We currently use the esbuild-native-node-modules-plugin package, which rewrites these to load properly. This brought the non-deterministic build problem back, unfortunately. Previously, we had used a post-build hook which searched the current folder for .node files, and if one was found, overwrite it with the already-compiled version in node_modules (if it existed), which because our CI preserves/restores node_modules between runs, and developers only re-run yarn install when an incoming merge modifies package.json, that solution mitigated this issue most of the time.

The yarn patch method seems fragile, and the esbuild plugin method seems like a lot more work (i'm actually somewhat confused as to how esbuild-native-node-modules-plugin even works after reading the source, it doesn't seem to control re-compiles, but i can md5sum the original module in <project-root>/node_modules against the esbuild version, different shas.

I can try the yarn patch or plugin options, but is there a way perhaps to tell esbuild to just use the version in <project-root>/node_modules and NOT recompile it for every build? That seems like the easiest/cleanest solution from an end-user perspective. I don't want the module to be rebuilt if the original source hasn't changed. We can just use the same one we used last time that already exists and is preserved between runs. The binary module itself isn't the problem. The fact that it gets re-built for each individual package IS a problem.

Binary modules themselves do not cause problems. They load fine when we ship them with our packages. Non-deterministic builds from esbuild DO cause problems -- each individual deployment that uses the binary module ends up changing its sha due to non-deterministic builds. Given, this is because node-gyp doesn't have consistent/deterministic source-file ordering, but the easy fix is "use the one we already have when applicable, but rebuild if needed".

hyrious commented 1 year ago

The compile process is not controlled by esbuild, instead it is the install script run by npm. There's a flag --ignore-scripts which tells npm to not run any package's install scripts. You may try this out, but it disables other packages too like the one in esbuild, which may cause other issues.

rubythulhu commented 1 year ago

There's a flag --ignore-scripts which tells npm to not run any package's install scripts.

Yeah, this is a non-starter because:

  1. as you mentioned, this applies globally, and no package manager allows this to be a per-dependency flag
  2. this would still fail at runtime -- if the module isn't there, the lambda dies. We could accomplish the same thing by just not using the native modules plugin and not using the esbuild install flag -- this causes the binary module to not be included, as if it were never re-built in the first place.

How does esbuild actually manage this, then? I was under the impression that esbuild compiles together all javascript files, then has some process afterwards. At what point does esbuild cause these modules to be re-installed in the local package, rather than using pre-existing dependencies? Does it re-install ALL node dependencies, whether binary or not, in a temporary place, and then use that to compile the final esbuild-built .js module? If so, why can't it use the pre-existing version in node modules? What causes this to rebuild every time?

Our CI builds all packages only moments before it asks esbuild to build esbuild bundles, without making any changes to source files. Then, esbuild re-builds every binary module for each individual esbuild package. I don't understand why anything needs to be re-npm installed during the esbuild run, especially when the immediately prior step in CI is yarn install.

hyrious commented 1 year ago

At what point does esbuild cause these modules to be re-installed in the local package, rather than using pre-existing dependencies?

AFAIK, there's no such process in esbuild itself.

If so, why can't it use the pre-existing version in node modules? What causes this to rebuild every time?

I don't know either, you have to debug this yourself. For example, if you mean your CI does run the install script everytime, maybe you need some cache strategy and custom install logic with --ignore-scripts and run necessary scripts manually.

Another way of disabling specific package's install script can be using the npm: prefix to replace a package with another but keeping the same folder name, i.e.

"overrides": {
  "ssh2-sftp-client": {
    "ssh2": "npm:@yourscope/ssh2-fork@version"
  }
}

Note: "overrides" is npm specific feature, yarn see this.

rubythulhu commented 1 year ago

AFAIK, there's no such process in esbuild itself.

Fascinating! Perhaps SST is causing this? I'll see if they have any idea.

rubythulhu commented 1 year ago

Thank you for your insight on all of this. This is super frustrating because it adds extra time to every developer's yarn dev process, and to every deploy to every environment, because any service using ssh2 must be redeployed every time, even if the code hasn't changed in months.

Honestly, node-gyp is the real problem here. Binary modules get compiled non-deterministically, and it seems like its really just because it generates a Makefile, which for a random binary looks like SomeBinary: Dep1 Dep2 Dep3 Dep4 ... and every time node-gyp runs the order of Deps changes, so the modules get compiled together in a different order?

I dunno. this has been a really deep rabbit hole. Thank you again.

evanw commented 1 year ago

I'm closing this issue since it sounds like this isn't a problem with esbuild itself.

rubythulhu commented 1 year ago

@evanw i am not so sure that it is not a problem with esbuild; or at least, I do not think the previously given explanation of "i guess the lambda runtime is doing it" is correct.

If i create a lambda with just this:

let x;
try { x = require("./foo"); console.log("missing module require ok?") } 
catch(e) { console.log("caught error", e); }
console.log("setup complete");

module.exports.handler = async(event) => {
    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};

this lambda works, without failing:

Response
{
  "statusCode": 200,
  "body": "\"Hello from Lambda!\""
}

Function Logs
2023-06-27T22:37:44.324Z    undefined   INFO    caught error Error: Cannot find module './foo'
Require stack:
- /var/task/index.js
- /var/runtime/index.mjs
    at Function.Module._resolveFilename (node:internal/modules/cjs/loader:1026:15)
    at Function.Module._load (node:internal/modules/cjs/loader:871:27)
    at Module.require (node:internal/modules/cjs/loader:1098:19)
    at require (node:internal/modules/cjs/helpers:108:18)
    at Object.<anonymous> (/var/task/index.js:2:11)
    at Module._compile (node:internal/modules/cjs/loader:1196:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1250:10)
    at Module.load (node:internal/modules/cjs/loader:1074:32)
    at Function.Module._load (node:internal/modules/cjs/loader:909:12)
    at Module.require (node:internal/modules/cjs/loader:1098:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [ '/var/task/index.js', '/var/runtime/index.mjs' ]
}
2023-06-27T22:37:44.324Z    undefined   INFO    setup complete
START RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521 Version: $LATEST
END RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521
REPORT RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521  Duration: 10.51 ms  Billed Duration: 11 ms  Memory Size: 128 MB Max Memory Used: 58 MB  Init Duration: 146.56 ms

Request ID
767dac5b-f54b-419a-aa4e-f4ea2cd13521

I have, however, also verified that the lambdas we have that ARE failing retain the try{ require } catch {} logic.

evanw commented 1 year ago

Sorry, I don't know anything about lambda. I thought it sounded like an external problem from the above thread. If you provide a description of the underlying issue and a self-contained way to reproduce it without lambda involved (e.g. just an esbuild command that you run on a file), I can take a look.

xer0x commented 2 weeks ago

This has something to do with how our esbuild is configured. On stackoverflow the recommendation is to set the {"loader": { ".node": "copy"}} or "file". Stackoverflow