Closed rubythulhu closed 1 year ago
According to the stack trace your posted here, the actual error message is:
Error: Cannot find module 'ssh2'
..which surely comes from ssh2-sftp-client
where it has code require('ssh2')
. So this issue is about why it cannot find the ssh2
package at runtime (did you externalized it?) instead of whether or not the .node
files are bundled/copied to the output.
The ssh2 package is there, and the javascript part of its source is included in the source bundle. That package does a try { require(binary) } catch { /* do nothing */ }
on the underlying binary module. This call should NEVER fail, because it swallows EVERY error.
Yet, at run-time? this fails. We did an upgrade of our software, which caused the binary modules to not be included in software (due to the above-described bug), and the lack of this binary module starts failing at runtime.
I understand that the error message says "can't load ssh2". But I can look at this package contents when it's failing, and a compiled-via-esbuild version of ssh2 exists in the bundle. Meanwhile, if i include the (confirmed via source to be optional) c++ dependency? function works. If my build process leaves it out? it fails reliably throwing an error message that it cannot load a binary module which doesn't exist.
Yes, we have tried external
, that was one of the first things we tried.
There are two situations we can reliably get to happen:
external: ["ssh2"]
-- this "works", ostensibly. It ABSOLUTELY WILL include the ssh2 package. However, node-gyp does not build deterministic binary modules, so every time we redeploy, the hash of our function changes (because inconsequential things get re-ordered during the binary module build, which causes the binary module to be recompiled "the same" functionality-wise but different from a hashing perspective. This would be totally cool, if we were happy with re-deploying this code on every CI run whether or not it had underlying changes, and populating our cdk diff
output with a bunch of "lol C module changed so this whole effing stack changed"sshcrypto.node
file entirely? Despite the try { require(binary) } catch { /* do nothing */ }
call, the esbuild version of the code will FAIL if this require does not work. This is baffling to me, I cannot comprehend what esbuild does that forces this to fail, but my assumption is that the blame lies in a blind AST transform which is unaware of the fact this module is optional, nor that it catches, tries the require, and when the require doesn't work it fails with error at runtime, due to not being able to load the module. (as if catch{}
meant rethrow(error)
, or as if there was never a try/catch in the first place)SST
's install
option: this option takes a list of dependencies, with or without versions, and stuffs them into a fake package.json, then does an npm/yarn install. This also "works", as long as we write a bunch of code to undo whatever esbuild thought was the right thing to do. if we leave esbuild's built source alone? the missing (and, again, "completely optional" according to the pkg owner) get non-deterministically rebuilt every deploy, and re-deploy every time, because their binary module's signature caused the .zip to be different, which forces a redeploy..node
file from the source tree's node_modules to the same place within a built esbuild module's source dir, replacing the "freshly rebuilt for no reason" version with one which is cached by our CI, and therefore doesn't change between builds. this is accomplished by a standalone script which runs as a post-build hook in lambda CI, and all it does is copy the cacheable node_modules version into each build basically blindly. Basically, just let esbuild fucking break the code, then bandaid it afterwards and call it good.I would absolutely appreciate suggested solutions, because everything i've tried is either "doesn't work", or "works, technically, but forces a redeploy every time". Nothing I have tried results in an answer that doesn't fit into the above two buckets.
Thank your for providing more detailed cases. So basically you mean when the .node
file is placed at the right location, then code magically runs well without throwing that module not found error. Without that file, the try-catch block seems not working and throws an error of not finding the whole package.
In that case, I guess this can be a weird behavior of the runtime -- AWS Lambda. I learned before where AWS did hack something in their runtime.
As a workaround, here's another choices to do The only thing that seems to "work"
but could be saner:
If you can patch packages, go edit the ssh2/lib/protocol/crypto.js to remove the try-catch block:
-binding = require('./crypto/build/Release/sshcrypto.node');
+binding = {};
If the above is not applicable and you can use esbuild plugins, create a plugin that do the same thing like above with the on-load callbacks.
If you want esbuild to bundle the .node file in (copy it to the correct place), you can make use of the builtin copy loader --loader:.node=copy
.
I'd rather not have the binary module in the first place, as they cause non-deterministic builds, meaning that every lambda that includes a binary module needs to be re-deployed every single time sst deploy
, sst dev
, or sst diff
runs, because the shasum of the build changes.
We currently use the esbuild-native-node-modules-plugin
package, which rewrites these to load properly. This brought the non-deterministic build problem back, unfortunately. Previously, we had used a post-build hook which searched the current folder for .node
files, and if one was found, overwrite it with the already-compiled version in node_modules
(if it existed), which because our CI preserves/restores node_modules between runs, and developers only re-run yarn install
when an incoming merge modifies package.json
, that solution mitigated this issue most of the time.
The yarn patch
method seems fragile, and the esbuild plugin
method seems like a lot more work (i'm actually somewhat confused as to how esbuild-native-node-modules-plugin
even works after reading the source, it doesn't seem to control re-compiles, but i can md5sum
the original module in <project-root>/node_modules
against the esbuild
version, different shas.
I can try the yarn patch
or plugin
options, but is there a way perhaps to tell esbuild to just use the version in <project-root>/node_modules
and NOT recompile it for every build? That seems like the easiest/cleanest solution from an end-user perspective. I don't want the module to be rebuilt if the original source hasn't changed. We can just use the same one we used last time that already exists and is preserved between runs. The binary module itself isn't the problem. The fact that it gets re-built for each individual package IS a problem.
Binary modules themselves do not cause problems. They load fine when we ship them with our packages. Non-deterministic builds from esbuild DO cause problems -- each individual deployment that uses the binary module ends up changing its sha due to non-deterministic builds. Given, this is because node-gyp
doesn't have consistent/deterministic source-file ordering, but the easy fix is "use the one we already have when applicable, but rebuild if needed".
The compile process is not controlled by esbuild, instead it is the install script run by npm. There's a flag --ignore-scripts
which tells npm to not run any package's install scripts. You may try this out, but it disables other packages too like the one in esbuild, which may cause other issues.
There's a flag --ignore-scripts which tells npm to not run any package's install scripts.
Yeah, this is a non-starter because:
install
flag -- this causes the binary module to not be included, as if it were never re-built in the first place.How does esbuild actually manage this, then? I was under the impression that esbuild compiles together all javascript files, then has some process afterwards. At what point does esbuild cause these modules to be re-installed in the local package, rather than using pre-existing dependencies? Does it re-install ALL node dependencies, whether binary or not, in a temporary place, and then use that to compile the final esbuild-built .js module? If so, why can't it use the pre-existing version in node modules? What causes this to rebuild every time?
Our CI builds all packages only moments before it asks esbuild to build esbuild bundles, without making any changes to source files. Then, esbuild re-builds every binary module for each individual esbuild package. I don't understand why anything needs to be re-npm install
ed during the esbuild run, especially when the immediately prior step in CI is yarn install
.
At what point does esbuild cause these modules to be re-installed in the local package, rather than using pre-existing dependencies?
AFAIK, there's no such process in esbuild itself.
If so, why can't it use the pre-existing version in node modules? What causes this to rebuild every time?
I don't know either, you have to debug this yourself. For example, if you mean your CI does run the install script everytime, maybe you need some cache strategy and custom install logic with --ignore-scripts and run necessary scripts manually.
Another way of disabling specific package's install script can be using the npm:
prefix to replace a package with another but keeping the same folder name, i.e.
"overrides": {
"ssh2-sftp-client": {
"ssh2": "npm:@yourscope/ssh2-fork@version"
}
}
Note: "overrides" is npm specific feature, yarn see this.
AFAIK, there's no such process in esbuild itself.
Fascinating! Perhaps SST is causing this? I'll see if they have any idea.
Thank you for your insight on all of this. This is super frustrating because it adds extra time to every developer's yarn dev
process, and to every deploy to every environment, because any service using ssh2 must be redeployed every time, even if the code hasn't changed in months.
Honestly, node-gyp
is the real problem here. Binary modules get compiled non-deterministically, and it seems like its really just because it generates a Makefile, which for a random binary looks like SomeBinary: Dep1 Dep2 Dep3 Dep4 ...
and every time node-gyp runs the order of Deps changes, so the modules get compiled together in a different order?
I dunno. this has been a really deep rabbit hole. Thank you again.
I'm closing this issue since it sounds like this isn't a problem with esbuild itself.
@evanw i am not so sure that it is not a problem with esbuild; or at least, I do not think the previously given explanation of "i guess the lambda runtime is doing it" is correct.
If i create a lambda with just this:
let x;
try { x = require("./foo"); console.log("missing module require ok?") }
catch(e) { console.log("caught error", e); }
console.log("setup complete");
module.exports.handler = async(event) => {
const response = {
statusCode: 200,
body: JSON.stringify('Hello from Lambda!'),
};
return response;
};
this lambda works, without failing:
Response
{
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}
Function Logs
2023-06-27T22:37:44.324Z undefined INFO caught error Error: Cannot find module './foo'
Require stack:
- /var/task/index.js
- /var/runtime/index.mjs
at Function.Module._resolveFilename (node:internal/modules/cjs/loader:1026:15)
at Function.Module._load (node:internal/modules/cjs/loader:871:27)
at Module.require (node:internal/modules/cjs/loader:1098:19)
at require (node:internal/modules/cjs/helpers:108:18)
at Object.<anonymous> (/var/task/index.js:2:11)
at Module._compile (node:internal/modules/cjs/loader:1196:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1250:10)
at Module.load (node:internal/modules/cjs/loader:1074:32)
at Function.Module._load (node:internal/modules/cjs/loader:909:12)
at Module.require (node:internal/modules/cjs/loader:1098:19) {
code: 'MODULE_NOT_FOUND',
requireStack: [ '/var/task/index.js', '/var/runtime/index.mjs' ]
}
2023-06-27T22:37:44.324Z undefined INFO setup complete
START RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521 Version: $LATEST
END RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521
REPORT RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521 Duration: 10.51 ms Billed Duration: 11 ms Memory Size: 128 MB Max Memory Used: 58 MB Init Duration: 146.56 ms
Request ID
767dac5b-f54b-419a-aa4e-f4ea2cd13521
I have, however, also verified that the lambdas we have that ARE failing retain the try{ require } catch {}
logic.
Sorry, I don't know anything about lambda. I thought it sounded like an external problem from the above thread. If you provide a description of the underlying issue and a self-contained way to reproduce it without lambda involved (e.g. just an esbuild command that you run on a file), I can take a look.
This has something to do with how our esbuild is configured. On stackoverflow the recommendation is to set the {"loader": { ".node": "copy"}}
or "file". Stackoverflow
See mscdex/ssh2#1310 for more info.
I am not sure what package causes this error, I am posting this on multiple projects that are involved here. It could be
ssh2
, it could beesbuild
, and it could besst
.We have some AWS lambdas which are deployed via
sst
and built byesbuild
. These lambdas havessh2-sftp-client
as a dependency, which in turn hasssh2
as a dependency. According to mscdex/ssh2#1310, this module SHOULD be optional, and indeed, the ssh2 package basically doestry { require(binary_module) } catch {}
.esbuild seems to fail in this case, because the
require()
failed, even thought hatrequire()
is caught by thecatch{}
block. We end up with a Lambda that fails RELIABLY when executed because this binary module could not be loaded, and this "optional" module does not exist in the build bundle, and the esbuild-compiled version of this function pukes on its own shoes because it cannot find this module (which is quite clearly optional).