Closed kwongkz closed 4 years ago
Hi @kwongkz, This doesn't sound like an error associated with X-Ray. Please provide a code snippet to reproduce this error as well as logs or another indication that X-Ray is causing this so I can further assist you.
Hi @willarmiros,
I think I've already found the issue and the solution.
I referred to issue #143 to get the idea.
The solution for me was to add the environment variable AWS_NODEJS_CONNECTION_REUSE_ENABLED=1; you can refer to this Lambda optimization tip.
I hope this can be added to the documentation so that it's clearer for other people.
Thanks for the assist.
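For anyone who would rather configure this in code than through the environment, here is a minimal sketch of what I understand that variable to do in the AWS SDK for JavaScript v2: it makes the SDK use an HTTPS agent with keep-alive enabled. The helper name buildKeepAliveHttpOptions and the maxSockets default of 50 are illustrative assumptions, not part of the SDK.

```javascript
const https = require('https');

// Hypothetical helper: builds an httpOptions object that could be passed to
// AWS.config.update({ httpOptions }) in the AWS SDK for JavaScript v2.
function buildKeepAliveHttpOptions(maxSockets = 50) {
  // keepAlive lets sockets be reused across requests instead of reconnecting.
  const agent = new https.Agent({ keepAlive: true, maxSockets });
  return { agent };
}

const httpOptions = buildKeepAliveHttpOptions();
// With the SDK installed, this would be applied as:
//   const AWS = require('aws-sdk');
//   AWS.config.update({ httpOptions });
```

As later comments in this thread show, connection reuse does not resolve the ECONNREFUSED warning in every case, so treat this as a general Lambda optimization rather than a guaranteed fix.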
I'm having the same issue.
Searching the internet for that IP address leads me to believe it's the IP address of the X-Ray daemon (see here). It should be a link-local address because it is in the 169.254.x.x range, so I think it is somewhere in the AWS network, though I'm no network specialist.
I've tried adding the AWS_NODEJS_CONNECTION_REUSE_ENABLED environment variable and setting it to 1, but I'm still getting the issue. I will investigate further and update if I find anything.
Hi petermorlion,
Is this affecting you on Lambda as well? The Daemon should be automatically configured in Lambda, no need to include it or set it up.
Hi awssandra,
Yes, this is on AWS Lambda. I see it with several Lambdas, all of which use the AWS X-Ray Express package. Strangely, I don't seem to have the issue when using only the core package, nor when using AWS X-Ray Express with NestJS (both in Lambdas), though those Lambdas are executed less often. I'll see if I can write a minimal Lambda and run a load test against it.
I am also seeing this error in x-ray traces. I am using the express package, node 12.x
I've been able to reproduce this on Node 10.x with this piece of code:
const AWSXRay = require('aws-xray-sdk-core');
const xrayExpress = require('aws-xray-sdk-express');
const express = require('express');
const serverlessHttp = require("serverless-http");

module.exports.handler = async function(event, context) {
  const app = getApp();
  const slsHttp = serverlessHttp(app);
  const result = await slsHttp(event, context);
  return result;
}

function getApp() {
  const app = express()
  app.use(xrayExpress.openSegment('PMO-xray-error-test'));
  app.get('/', function (req, res) {
    res.send('Hello World')
  })
  app.use(xrayExpress.closeSegment());
  return app;
}
I just invoked the API Gateway several times from the AWS Console. So this is no heavy load test, i.e. no concurrent requests. As you can see, some invocations have no issue, but others do:
After that, other requests work fine again. So there's no real pattern I can deduce.
Things I've tried that didn't make a difference:
Setting httpOptions.agent of the AWS config with keepAlive set to true and maxSockets set to 50.
Sorry for the delayed response!
I'm thinking there's a disconnect between the custom Lambda code and the Express middleware. Each has its own expected workflow for the daemon and SDK behavior. We'll take a deep dive into this.
Hi @petermorlion, I am investigating this issue with the Lambda team. Please sit tight for any updates!
@willarmiros I don't mean to put pressure on you, but I'm curious if there is any progress on this?
Hi @petermorlion,
After some further inspection, it appears the root cause is in our service connector here. It's from a poller that runs in the background to retrieve sampling rules from X-Ray's service back end roughly every 5 minutes (speaking of patterns, you should see the error about every 5 minutes if you're consistently making requests uninterrupted by cold starts). These requests are attempting to communicate directly with the daemon, which is not possible in Lambda environments.
I changed the way we make these requests in #255 to no longer be lazy, and that actually appears to have made the errors appear instantly upon invocation. I'm going to make a PR to disable these requests for now in Lambda environments, since we don't support sampling configuration in Lambda yet.
"These requests are attempting to communicate directly with the daemon, which is not possible in Lambda environments."
@willarmiros Why isn't this possible? Is the X-Ray service on a private network?
Also, when you disable them, how will you check for it? This code is essentially Express code, and Express code is not aware that it is running in a Lambda environment.
In the meantime, will I be able to stop the API call with AWSXRay.middleware.disableCentralizedSampling()?
Hi @avin-kavish, Sorry, that wasn't entirely correct. It is possible to communicate with the daemon in Lambda environments, but only over UDP connections (see segment_emitter). The problematic requests that we're making use TCP under the hood. I believe that the Lambda service has some tight iptables configurations prohibiting these.
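To make the UDP/TCP distinction concrete, here is a rough sketch of sending a segment to the daemon over UDP with Node's dgram module. The function names are hypothetical, the default host and port are placeholders, and the header line reflects my understanding of the daemon's documented wire format; the SDK's actual segment_emitter is more involved.

```javascript
const dgram = require('dgram');

// Header the X-Ray daemon expects before each JSON segment document
// (assumption based on the documented daemon protocol).
const DAEMON_HEADER = '{"format": "json", "version": 1}\n';

// Hypothetical helper: builds the UDP payload for one segment document.
function buildDaemonPacket(segment) {
  return Buffer.from(DAEMON_HEADER + JSON.stringify(segment));
}

// Hypothetical helper: fire-and-forget send over UDP. No connection is
// established first, so there is no TCP handshake that could be refused.
function sendSegment(segment, host = '127.0.0.1', port = 2000) {
  const socket = dgram.createSocket('udp4');
  const packet = buildDaemonPacket(segment);
  socket.send(packet, port, host, () => socket.close());
}
```

A TCP request to the same address, by contrast, has to complete a connection first, which is where an ECONNREFUSED error like the one in this issue would surface.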
I will check for Lambda environments using the LAMBDA_TASK_ROOT environment variable, which is how we make the check elsewhere in the SDK. The disableCentralizedSampling() call should prevent these errors; that's a great callout. However, to minimize the burden on other customers, I'll still just disable sampling by default in Lambda until we get better support for it.
This fix was released in v3.0.0-alpha.2.
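For anyone on an SDK version before that release, the workaround discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, and the xray argument is expected to be the object returned by require('aws-xray-sdk-core'), which is passed in so the sketch does not depend on the package being installed.

```javascript
// Returns true when running inside AWS Lambda, based on the
// LAMBDA_TASK_ROOT environment variable mentioned in this thread.
function isLambdaEnvironment(env = process.env) {
  return Boolean(env.LAMBDA_TASK_ROOT);
}

// Hypothetical helper: disables centralized sampling only in Lambda.
// `xray` is assumed to be the aws-xray-sdk-core module object.
// Returns true if sampling was disabled, false otherwise.
function disableSamplingInLambda(xray, env = process.env) {
  if (!isLambdaEnvironment(env)) {
    return false;
  }
  xray.middleware.disableCentralizedSampling();
  return true;
}
```

In a handler module this would be called once at load time, for example disableSamplingInLambda(require('aws-xray-sdk-core')), before the Express middleware is registered.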
Does anyone have an idea why this happens? I'm using Serverless with Lambda, and X-Ray tracing is turned on.
WARN Error: connect ECONNREFUSED 169.254.79.2:2000 at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1107:14)