Closed EGrun closed 3 years ago
Hi EGrun,
Lambda does not have the X-Ray header until the function handler is run. You should be able to put AWSXRay.captureHTTPsGlobal(require('http')) outside your handler, but using the http client will have to be in the function handler code.
https://docs.aws.amazon.com/lambda/latest/dg/downstream-tracing.html
With captureHTTPsGlobal, the http library itself is patched to capture on every http client initialized. There is not a concept to opt out matching request URLs today. However, you can use captureHTTPs, to enable tracing on a single http client instance, where ever you need the call captured.
Hope this helps, Sandra
Hi Guys,
I'm having the same issue but I noticed we have closed this one and #701 is only a partial fix.
You should be able to put AWSXRay.captureHTTPsGlobal(require('http')) outside your handler, but using the http client will have to be in the function handler code.
This works in node 8.10 but not node 10.x. The exact same code for node 10.x is giving me Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
Im using the captureHTTPglobal code from the docs and im only using the http client inside the handler.
Can we reopen this as its preventing us from upgrading to node10.
By any chance, are you enabling http keep-alives for the AWS SDK, per https://theburningmonk.com/2019/02/lambda-optimization-tip-enable-http-keep-alive/
We're doing this in a single file that we include for every handler, and we see this issue on cold starts.
@hotgazpacho yup we are, so I guess thats partly the issue but for some reason it did not cause a problem on node 8.
@Imran99 trying to maybe help diagnose here... In what order do you set up:
require('https')
https.agent
setuprequire('aws-xray-sdk-core')
AWSXRay.captureHTTPsGlobal
My thinking here is that the order that these things happen is important.
Thanks @hotgazpacho we have two internal npm packages that do it in this order:
normal aws sdk tracing package:
and another http client with tracing package that does:
So, I initialize thusly:
const AWSXRay = require('aws-xray-sdk-core');
const AWS = AWSXRay.captureAWS(require('aws-sdk'));
AWSXRay.captureHTTPsGlobal(require('https'));
function getKeepAliveAgent() {
const options = {
keepAlive: true,
maxSockets: 50, // from aws-sdk
rejectUnauthorized: true, // from aws-sdk
};
const agent = new https.Agent(options);
agent.setMaxListeners(0); // from aws-sdk
return agent;
}
AWS.config.update({
httpOptions: {
agent: getKeepAliveAgent(),
},
});
Where I'm seeing this error is during cold starts of my Lambda function, wherein I use SecretsManager
outside of the handler in the global scope to get the secrets config once and keep it around for the lifetime of the function instance. This global initialization is a pretty common scenario in Lambda.
Are there any updates on this issue? We are experiencing this as well and with Node 8 being EoL at year end this would become a problem
We are actively looking into this issue and discussing it with the Lambda team. Unfortunately this change would be a fairly substantial refactor for them. In the meantime, we recommend using the non-global HTTP capture API.
If Lambda is unable to complete the change, we will consider adding some kind of string matching to not trace these requests in the SDK. The issue with this is 1) It is an issue with Lambda so that feature would need to be replicated across all X-Ray SDKs and 2) special-case not-capturing with string matching is not exactly a best practice, so we'd like to avoid it if possible. Hope this clarifies some confusion about this issue.
So basically we have to choose between being able to update Lambda functions or being able to use X-Ray.
I think it's obvious what most organizations are going to choose. Can you ask the Lambda team if they can extend the deadline to upgrade Lambda functions to Node 10.x until this is resolved?
Lambda has told us a potential solution is underway. I will update this thread once it is released. If the solution does not work, our team will add the filtering logic to ignore these internal calls before Node 8 EOL is reached.
Lambda has released what should be a fix for this problem globally! Please follow up if this issue persists, otherwise I'll close this issue.
Now actually can directly use env - AWS_NODEJS_CONNECTION_REUSE_ENABLED to 1
, no longer need these couple of lines of code New way to keepalive
I've noticed that the errors have disappeared but I'm not seeing http calls in traces at all anymore, could be doing something wrong on our end haven't had a chance to investigate yet. Anyone else seeing this?
Hi @Imran99, If you are still not seeing HTTP traces please follow up in a separate issue. I'm going to close this issue since no one else seems to be experiencing it anymore.
@willarmiros I'm getting Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
when upgrading to version 3.0.0-alpha.1
.
I'm using Node v12, and I'm using HTTP keep-alive.
I didn't get it with version 2.5.0-experimental.1
btw.
Hi @chearmstrong,
I'm not able to reproduce this error on a Lambda running node 12 and the X-Ray SDK 3.0.0-alpha.1
. I initialized my HTTPS client in the same way that @hotgazpacho did and made a request with it and everything ran & was instrumented as expected.
Can you please post a snippet that reproduces this issue?
Hi @willarmiros - my code is something like this.
const __AWS__ = require('aws-sdk')
const AWSXRay = require('aws-xray-sdk')
// ...require some other stuff
const AWS = AWSXRay.captureAWS(__AWS__)
// ...setup xray
AWSXRay.captureHTTPsGlobal(require('http'))
AWSXRay.captureHTTPsGlobal(require('https'))
AWSXRay.capturePromise()
// ... other code/helper functions
// ...lambda handler
const handler = async (event, context = {}) => {
return AWSXRay.captureAsyncFunc('handler', subSegment => {
subSegment.addMetadata('body', event.body || {}, 'request')
subSegment.addMetadata('headers', event.headers || {}, 'request')
subSegment.addAnnotation('userId', getUserId(event))
return handleEvent(event)
.then(response => {
subSegment.addMetadata('body', response, 'response')
subSegment.close()
return getSuccessResponse(response)
})
.catch(error => {
subSegment.addAnnotation('error', error.message || 'UNKNOWN')
subSegment.close(error)
return getErrorResponse(error)
})
})
}
Worth nothing - I'm only seeing it on cold starts too.
Hi @chearmstrong,
I resolved the issue by moving AWSXRay.capturePromise()
into the function handler, immediately before your return
statement. Feel free to reopen if you still have issues.
@willarmiros Sorry for the delay - I will give this a try, thanks.
@willarmiros I'm experiencing the same issue - why is it necessary to move AWSXRay.capturePromise()
?
This remains an issue for us, even after moving AWSXRay.captureHTTPsGlobal(require('https'));
into the handler.
Versions of aws-xray-sdk-core
tried:
2.5.0
3.0.0-alpha.1
Before -- we see "Missing AWS Lambda trace data for X-Ray"
const { ApolloServer } = require("apollo-server-lambda");
const xray = require('aws-xray-sdk-core');
xray.captureHTTPsGlobal(require('https'), false);
// ... other stuff
const server = new ApolloServer({ /** config **/ })
exports.handler = server.createHandler();
After -- we still see "Missing AWS Lambda trace data for X-Ray"
const { ApolloServer } = require("apollo-server-lambda");
const xray = require('aws-xray-sdk-core');
// ... other stuff
const server = new ApolloServer({ /** config **/ })
const createHandler = function() {
const handler = server.createHandler();
xray.captureHTTPsGlobal(require('https'), false);
return handler;
}
exports.handler = createHandler();
I resolved the issue by moving AWSXRay.capturePromise() into the function handler, immediately before your return statement. Feel free to reopen if you still have issues.
This worked for me, but only with v3. Thanks.
I am using node-fetch
with X-Ray SDK and could not get it working initially.
Finally https://github.com/aws/aws-xray-sdk-node/issues/143#issuecomment-595580064 solved it, but that seems a bit "annoying trick" 😅 Moving this kind of "initialization code" in the end of handler just feels a bit wrong to me.
Well… since I'm using Middy I refactored this into middleware so I don't feel so dirty 🙂
In case anyone's interested:
// middlewares/instrument.ts
import { Handler } from 'aws-lambda'
import { NextFunction } from "middy";
const AWSXRay = require('aws-xray-sdk');
const instrumentationMiddleware = {
after: (handler: Handler, next: NextFunction) => {
AWSXRay.capturePromise();
next();
},
}
export default instrumentationMiddleware;
// index.ts
import { APIGatewayEvent, Context, ProxyResult } from 'aws-lambda';
const wrap = require('@dazn/lambda-powertools-pattern-basic');
import instrumentationMiddleware from "./middlewares/instrument";
export const handler = wrap(async (event: APIGatewayEvent, context: Context): Promise<ProxyResult> => {
/* ... */
return response;
})
.use(instrumentationMiddleware);
Hi, can you explain why capturePromise() should be inside lambda handler or otherwise it throws 'Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set' error (only during lambda's cold start)? In aws doc (https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-nodejs-httpclients.html) there is no mention of need for capturePromise invocation, but in aws-xray for node readme (https://github.com/aws/aws-xray-sdk-node/tree/master/packages/core#capture-all-outgoing-axios-requests) there is example for capturing axios with capturePromise invoked on the top of a module, which is opposite to this issue's solving. According to this issue's solution capturePromise should be called at the end of a handler, but capturePromise patches promise's then method, so in my opinion it should be called right at the beginning of a handler to capture all requests inside handler - why you recommend it to be called at the end? It's a bit confusing, so can you please explain what is correct way of capturing http(s) requests and why?
Hi @padzikm,
This issue has gotten fairly cluttered with tangentially related problems that we attempted to fix, but the original issue that you're experiencing turns out to still be unresolved, as pointed out in #315. We have investigated with the Lambda team and we've found the issue to be a bug in the SDK. I will be working on getting a fix out and released as soon as possible. I will post further updates in #315.
The problem with calling things like capturePromises
is that it is adding another function to the .then
. Depending on how many calls a lambda gets, it could end up adding a lot of functions to the Promise prototype slowing things down.
function patchPromise(Promise) {
var then = Promise.prototype.then;
Promise.prototype.then = function(onFulfilled, onRejected) {
if (contextUtils.isAutomaticMode()
&& tryGetCurrentSegment()
) {
var ns = contextUtils.getNamespace();
onFulfilled = onFulfilled && ns.bind(onFulfilled);
onRejected = onRejected && ns.bind(onRejected);
}
return then.call(this, onFulfilled, onRejected);
};
}
(commenting here as I'm not sure this capturePromise() issue is related to #315)
As others already stated, it doesn't look correct to have to call capturePromise() within the handler in order to work around that function triggering the misleading "context missing" message when a Promise then() is called outside the handler.
Would it be possible to not have this error message logged under this scenario instead?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs in next 7 days. Thank you for your contributions.
Not stale, still relevant.
Experiencing the same issue with a middy implementation using AWS Xray. If I'm following my cloudwatch logs correctly, it seems that the missing _X_AMZN_TRACE_ID
is only logged as missing during a cold start invocation of my lambda. The curious thing is, is that the error doesn't have any actual discernible impact on the logic we're using (Which is: to create an xray trace url from the _X_AMZN_TRACE_ID
in the lambda's env vars). The url is correctly created, so I believe that the error is coming about as a side effect of some sort when calling .captureAWS()
(this is just a guess, I'm in the process of debugging but I have no real clue where the issue is actually coming from)
If it makes any difference at all, the project which builds the dependencies for the mono repo is using lerna and aws cdk. This issue hasn't popped up in implementations using middy on lambdas bundled using serverless. Currently, I have only seen the issue crop up in lambda's created using nodejs construct from aws-lambda-nodejs. The only other notable difference I have found is that these assets are being bundled with esbuild as opposed to the native bundling used for assets in the aws-lambda
module.
All this is to say, this issue is still persisting, but it's not having an impact on the business logic (that I can find). I would love to get to the bottom of /why/ it's happening, but since it's not seemingly having a noticable effect, I'm left wondering if there is a simple way to silence the log for the missing trace_id.
so I believe that the error is coming about as a side effect of some sort when calling .captureAWS()
@Dgarc359 Do you use any AWS SDK clients to make requests in your initialization (cold start) code? If so, that is the reason you're receiving the error. captureAWS is trying to instrument those requests, but the _X_AMZN_TRACE_ID
env var is not set for cold start code.
@willarmiros Here is the full stack trace of the error
2021-11-22T18:54:21.378Z undefined ERROR Error: Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
at Object.contextMissingLogError [as contextMissing] (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:26:19)
at Segment.resolveLambdaTraceData (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/env/aws_lambda.js:93:43)
at Object.getSegment (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:94:17)
at Object.resolveSegment (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/context_utils.js:73:19)
at features.constructor.captureAWSRequest [as customRequestHandler] (/opt/nodejs/node_modules/aws-xray-sdk-core/lib/patchers/aws_p.js:66:29)
at features.constructor.addAllRequestListeners (/opt/nodejs/node_modules/aws-sdk/lib/service.js:279:34)
at features.constructor.makeRequest (/opt/nodejs/node_modules/aws-sdk/lib/service.js:203:10)
at features.constructor.svc.<computed> [as describeTable] (/opt/nodejs/node_modules/aws-sdk/lib/service.js:685:23)
at main (/node_modules/dynamoose/lib/aws/ddb/internal.ts:30:29)
at func (/node_modules/dynamoose/lib/Model/index.ts:102:60)
at getTableDetails (/node_modules/dynamoose/lib/Model/index.ts:111:10)
at check (/node_modules/dynamoose/lib/Model/index.ts:172:33)
at /node_modules/dynamoose/lib/Model/index.ts:190:3
at new WrappedPromise (/opt/nodejs/node_modules/async-listener/es6-wrapped-promise.js:13:18)
at /node_modules/dynamoose/lib/Model/index.ts:167:30
at /node_modules/dynamoose/lib/Model/index.ts:344:41
I believe that what's happening here is that dynamoose is being used to call dynamoDB, and I believe our Middy implementation is adding the .captureAWSRequest [as customRequestHandler]
part of that stack. It seems there is some interaction between our middy code, xray, dynamoose, and cold starts that hasn't been accounted for. I'm guessing your suggestion would be to define some behavior in which the _X_AMZN_TRACE_ID
isn't attempted to be accessed during cold start?
I resolved my issue. I'm not going to say I 100% understand the underlying interactions that were causing the error to be thrown up, but from my understanding, it seems like Dynamoose during cold start code would try to make a connection to a DynamoDB instance, and our middy implementation was attempting to capture that request and pull the trace id. Like you said, this would cause an issue due to the env var not being there during cold starts.
The issue was resolved when I declared our middy code as an external module in the bundling
prop of the node JS lambda construct. I was originally modifying esbuild to declare our middy code as external, which, to me at least, does the same thing, but it seems like there's something happening under the hood when you declare an external module in the aws-cdk construct as opposed to the actual esbuild node module that was resolved when declaring the module in the construct itself. Can't say more without having a more holistic understanding of the underlying code though
After updating to the 'nodejs10.x' runtime environment, I am now seeing this error message at the start of the cloudwatch logs:
We are using,
AWSXRay.captureHTTPsGlobal(require('http'));
to capture all http requests and this is the source of the error. It is also capturing all calls to 127.0.0.1 as part of the runtime event loop.A fix was to move the captureHTTPsGlobal command inside of the lambda function function handler. Before, it was running from the global module level. We are still getting x-ray results for calls to localhost for the lambda runtime environment. Is there a way to ignore requests to 127.0.0.1?