Closed rubythulhu closed 1 year ago
Hey @rubythulhu, thanks for reaching out!
The reason Datadog shows up in this stack trace (and any stack trace where a module can't be loaded or the runtime pukes on its own shoes) is because our instrumentation functions by requiring your handler code (source). If Node can't load that file, it pukes.
I'm not sure we do anything special here - this same stacktrace could happen if the handler included a syntax error, or similarly imported a file that didn't exist. We don't depend on that file, nor do we require it explicitly.
As you've esbuilt this project, I can't really check where else this is calling from in your own handler code or the dependencies packaged. Can you extract the code around this error and see where require(ssh2)
is being called?
If you remove esbuild from your application, does this still throw an error?
If you remove this library, does it throw an error?
I read your conversation in esbuild and ssh2, and see your comment:
Apparently, due to an issue in AWS lambda where their runtime breaks the try {require} catch {} (in ssh2 itself, and the compiled esbuild version adds another catch{} layer) and causes the runtime error, the only ways to fix this are all different variations of "remove the part where ssh2 tries to load the binary module entirely":
I'm not sure I have more to add here, except that I've extracted the node runtimes into a gist, which may help your investigation.
Thanks again!
I am unable to replicate the issue by doing a simple esbuild-free lambda which also does not contain the datadog layer:
let x;
try { x = require("./foo"); console.log("missing module require ok?") }
catch(e) { console.log("caught error", e); }
console.log("setup complete");
module.exports.handler = async(event) => {
const response = {
statusCode: 200,
body: JSON.stringify('Hello from Lambda!'),
};
return response;
};
Logs say:
Response
{
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}
Function Logs
2023-06-27T22:37:44.324Z undefined INFO caught error Error: Cannot find module './foo'
Require stack:
- /var/task/index.js
- /var/runtime/index.mjs
at Function.Module._resolveFilename (node:internal/modules/cjs/loader:1026:15)
at Function.Module._load (node:internal/modules/cjs/loader:871:27)
at Module.require (node:internal/modules/cjs/loader:1098:19)
at require (node:internal/modules/cjs/helpers:108:18)
at Object.<anonymous> (/var/task/index.js:2:11)
at Module._compile (node:internal/modules/cjs/loader:1196:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1250:10)
at Module.load (node:internal/modules/cjs/loader:1074:32)
at Function.Module._load (node:internal/modules/cjs/loader:909:12)
at Module.require (node:internal/modules/cjs/loader:1098:19) {
code: 'MODULE_NOT_FOUND',
requireStack: [ '/var/task/index.js', '/var/runtime/index.mjs' ]
}
2023-06-27T22:37:44.324Z undefined INFO setup complete
START RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521 Version: $LATEST
END RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521
REPORT RequestId: 767dac5b-f54b-419a-aa4e-f4ea2cd13521 Duration: 10.51 ms Billed Duration: 11 ms Memory Size: 128 MB Max Memory Used: 58 MB Init Duration: 146.56 ms
Request ID
767dac5b-f54b-419a-aa4e-f4ea2cd13521
However, I am able to validate that the esbuild-compiled source does INDEED contain the try{ require } catch{}
code verbatim, without changes, so it seems from the above evidence that neither AWS lambda runtime nor esbuild is causing this problem.
When I have a little more time I will try a standalone test that includes the datadog layer and configuration, to see if i can get this to fail that way, but i'll need to use a more complicated setup than "manually create single-file lambda" probably to get an accurate test.
Okay thanks @rubythulhu that is helpful.
The error from that trace is coming from the cjs
loader, and what's interesting is that we're actually throwing an ImportModuleError
- which is something we only create if an unhandled exception is thrown when loading the user's handler, and the e.code === MODULE_NOT_FOUND
. You can find it here
Is there a chance the catch
clause is re-raising? Or maybe this is coming from some other part of the file? If you don't package with es-build, does this go away?
It'd be really helpful if we could have a reproducible example here, as without seeing your handler logic or the full/unredacted stack trace I don't think I can explore much more.
We may be able to make more progress if you can open a support ticket and share these details privately. Feel free to attach this github thread, and you'll be quickly escalated to me.
Thanks!
Is there a chance the
catch
clause is re-raising? Or maybe this is coming from some other part of the file? If you don't package with es-build, does this go away?
I don't believe so. esbuild does complicate the original code during the compilation process, but:
// node_modules/ssh2/lib/protocol/crypto/build/Release/sshcrypto.node
var sshcrypto_default;
var init_sshcrypto = __esm({
"node_modules/ssh2/lib/protocol/crypto/build/Release/sshcrypto.node"() {
sshcrypto_default = "./sshcrypto-PYHRYRPA.node";
}
});
// node-file:/Users/ruby/work/canopy/canopy-api/node_modules/ssh2/lib/protocol/crypto/build/Release/sshcrypto.node
var require_sshcrypto = __commonJS({
"node-file:/Users/ruby/work/canopy/canopy-api/node_modules/ssh2/lib/protocol/crypto/build/Release/sshcrypto.node"(exports, module2) {
init_sshcrypto();
try {
module2.exports = require(sshcrypto_default);
} catch {
}
}
});
// node_modules/ssh2/lib/protocol/crypto.js
var require_crypto = __commonJS({
"node_modules/ssh2/lib/protocol/crypto.js"(exports, module2) {
"use strict";
var {
createCipheriv,
createDecipheriv,
createHmac,
randomFillSync,
timingSafeEqual
} = require("crypto");
var { readUInt32BE, writeUInt32BE } = require_utils();
var FastBuffer = Buffer[Symbol.species];
var MAX_SEQNO = 2 ** 32 - 1;
var EMPTY_BUFFER = Buffer.alloc(0);
var BUF_INT = Buffer.alloc(4);
var DISCARD_CACHE = /* @__PURE__ */ new Map();
var MAX_PACKET_SIZE = 35e3;
var binding;
var AESGCMCipher;
var ChaChaPolyCipher;
var GenericCipher;
var AESGCMDecipher;
var ChaChaPolyDecipher;
var GenericDecipher;
try {
binding = require_sshcrypto();
({
AESGCMCipher,
ChaChaPolyCipher,
GenericCipher,
AESGCMDecipher,
ChaChaPolyDecipher,
GenericDecipher
} = binding);
} catch {
}
This exception is caught with catch { /* do nothing */ }
in both cases above.
Hi @rubythulhu - thanks again. One thing to check - are you possibly using other datadog tools like datadog-ci or datadog's serverless plugin?
If so, they should be dev dependencies of your project and not included in any runtime code, but those both have dependencies on ssh2
(the plugin depends on datadog-ci), and I'm wondering if somehow something meant for a dev env is being loaded in your Lambda function?
Hey @rubythulhu - since we haven't heard from you for a few weeks and don't have a way to reproduce this issue, I don't think I have a way to move forward with this issue. I'm going to close it for now, but feel free to re-open if you've got an example which reproduces this bug.
Thank you!
Our error comes from these three lines: https://github.com/DataDog/datadog-lambda-js/blob/main/src/runtime/user-function.ts#L215-L217
Expected Behavior
As outlined in mscdex/ssh2#1310, to the ssh2 module, in normal circumstances where datadog-lambda-js is not involved, the bundled sshcrypto dependency is NOT required. If this file is missing,
ssh2
package itself just moves along. However, datadog pukes on its own shoes trying to load this optional module.Actual Behavior
When trying to bundle a lambda with optional runtime dependencies, in a
try {require(something)} catch { /* do nothing */ }
block, datadog lambda layer will attempt to require this file anyway, and cause a runtime failure when it doesn't exist.Steps to Reproduce the Problem
ssh2-sftp-client
to upload a file to any sftp serverSpecifications
Stacktrace
There is a chance that this comes from
esbuild
, as esbuild is part of our lambda compilation step. I will also be submitting a bug report to them.