aws / aws-xray-sdk-node

The official AWS X-Ray SDK for Node.js.
Apache License 2.0

SQS Tracing with AWSTraceHeader #208

Closed rogueai closed 1 year ago

rogueai commented 4 years ago

Hi,

I have an SQS queue subscribed to an SNS topic, which in turn triggers a Lambda on new SQS messages.

I've tried following the docs here: https://docs.aws.amazon.com/xray/latest/devguide/xray-services-sqs.html to create segments related to SQS, but so far everything I tried didn't work. I've added a segment this way:

const traceHeaderStr = record.attributes.AWSTraceHeader;
const traceData = AWSXRay.utils.processTraceData(traceHeaderStr);
const segment = new AWSXRay.Segment("SQS", traceData.Root, traceData.Parent);
delete segment.service;
segment.origin = "AWS::SQS";
segment.inferred = true;
segment.addPluginData({
   operation: "SendEvent",
   region: record.awsRegion,
   request_id: context.awsRequestId,
   queue_url: record.eventSourceARN
});

What this produces though is something like this:

[Screenshot: Screenshot_2019-11-05_at_10_44_52]

As you can see, the SQS segment is created with SNS as a parent, however the Lambda invocation is disconnected.
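For readers unfamiliar with the `AWSTraceHeader` attribute used above: it is a semicolon-separated list of `key=value` pairs, for example `Root=...;Parent=...;Sampled=1`. Below is a minimal hand-rolled sketch of splitting it into fields. It is not the SDK's `processTraceData` implementation, and `parseTraceHeader` is a hypothetical name. Keys are lower-cased here as an assumption; check which casing your SDK version returns.

```js
// Hypothetical helper: splits an X-Ray trace header string into its fields.
// Not part of aws-xray-sdk; shown only to illustrate the header format.
function parseTraceHeader(header) {
  const fields = {};
  if (typeof header !== "string" || header.length === 0) {
    return fields;
  }
  for (const part of header.split(";")) {
    const idx = part.indexOf("=");
    if (idx <= 0) continue; // skip empty or malformed pairs
    const key = part.slice(0, idx).trim().toLowerCase();
    const value = part.slice(idx + 1).trim();
    if (key && value) fields[key] = value;
  }
  return fields;
}

// Example header in the X-Ray tracing header format.
const exampleFields = parseTraceHeader(
  "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
);
console.log(exampleFields.root, exampleFields.parent, exampleFields.sampled);
```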

I've also tried other approaches that didn't work:

Could you advise a possible workaround to get this working?

Thank you!

0xe1d1a commented 3 years ago

Having the same issue. It makes connecting the traces a pain when monitoring serverless applications. Sounds like we need to look at a third-party tool :(

tdziwok commented 3 years ago

+1

gokhanoner commented 3 years ago

It's really interesting and sad that a system that calls itself a Distributed Tracing System still, in 2021, doesn't support one of the most basic and widely used pieces of functionality in the whole AWS ecosystem: SQS-triggered Lambdas. Are there any plans from the product team, a concrete timeline, or at least a word on whether you consider this an important feature and plan to support it?

kfirba commented 3 years ago

+1 eagerly waiting for this 🙂 Keep up the good work!

mahe-work commented 3 years ago

+1. Trying to use tracing for the first time, not impressed by the length of this discussion...

afayes commented 3 years ago

Looking at integrating tracing between SQS and Lambda, but not very happy that it is not supported out of the box. As others have mentioned, it is a basic requirement for many applications that use AWS and tracing. This issue has also been open for 1 year and 7 months.

wswoboda commented 3 years ago

Can you at least update the documentation here https://docs.aws.amazon.com/xray/latest/devguide/xray-services-sqs.html so it's clear that this is not working yet? I just spent a day trying to make it work and checking for bugs in my integration.

nikevp commented 2 years ago

+1

clarkee1066 commented 2 years ago

+1

This is a major weakness in X-Ray and makes it unusable in a microservices or serverless system. The Features page on the AWS site says:

AWS X-Ray supports applications running on Amazon Elastic Compute Cloud (Amazon EC2), .... AWS Lambda,.... It also captures metadata for requests made to Amazon Simple Queue Service and Amazon Simple Notification Service.

But without a fix for this issue it is not possible to trace a request through SQS/Lambda, which is a typical microservices/serverless pattern.

seungyongshim commented 2 years ago

+1

s1mrankaur commented 2 years ago

+1. Is there an update on this yet? We always use SQS for inter-service communication, and this defeats the whole purpose of using X-Ray.

joshgoodwin commented 2 years ago

2 years waiting for what should have been a core feature. This "product" is dead, I'm moving on.

tomaszdudek7 commented 2 years ago

While I agree and share your frustration @joshgoodwin, I think that it is probably not the issue with X-Ray itself but rather SQS making it hard to implement it correctly.

SQS is a very old service and I bet it is challenging to tweak.

tomaszdudek7 commented 2 years ago

Perhaps we could ask @willarmiros whether there's any actual progress on it or not.

dfens1 commented 2 years ago

> SQS is a very old service and I bet it is challenging to tweak.

Maybe it is. But 2 years without any progress on what seems to be quite an essential feature??

psimsa commented 2 years ago

> While I agree and share your frustration @joshgoodwin, I think that it is probably not the issue with X-Ray itself but rather SQS making it hard to implement it correctly.

I would dare to disagree. Wiring up Jaeger tracing, for instance, works flawlessly, propagating the tracing ID through lambda -> sns -> sqs -> lambda without any issue. It takes ~5 extra lines of code per Lambda, but other than that it's fairly easy. So if a third party can, why can't AWS?

tomaszdudek7 commented 2 years ago

@ofcoursedude There's a difference between implementing tracing inside Lambdas with custom code (like you mentioned with Jaeger) and making it work with no code involved at all (like enabling X-Ray at the API Gateway level). I think what they want is an option to pass traces through SQS without ANY additional code on our side. That forces them to implement this inside SQS which, as I mentioned, might be harder.

But I'm only playing devil's advocate here.

Anyway, an article about Jaeger tracing inside SQS/Lambdas and how you set that up, with some examples, would be a great read, appreciated by many. I looked it up and there are few (if any) resources on that topic so far.

psimsa commented 2 years ago

I understand that. However, my point is this: I was able to pass tracing information from one Lambda to another through a chain of SNS and SQS, using code that I investigated and PoC'ed in one afternoon. So I am really puzzled why there is no progress on this, especially given that the X-Ray SDK actually allows switching the tracing context on other platforms (so you can pass the tracing header in the message's metadata, like I did with Jaeger), yet trying to do that in the Lambda runtime does not work. I really wonder what makes it so different at this level from "regular" runtimes (in my case .NET).

As for the Jaeger solution: yes, I'm trying to get myself to write a post about it somewhere, but I haven't gotten far. At a high level, you basically proceed as you would in a regular runtime and pass the header in the message's metadata. Then, when your function is invoked and receives the message, extract the tracing info and use it to continue the previous context, adding a new session.

I PoC'd it at work, passing from a .NET API service running on EC2 through SNS/SQS to Lambda, and then in a chain of Lambda -> SNS/SQS -> Lambda up to DynamoDB.
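The approach described in this comment (pass the trace header in the message's metadata, then read it back in the consumer) can be sketched roughly as follows. This is an illustration only, not the commenter's actual code: `TRACE_ATTRIBUTE`, `withTraceContext`, and `readTraceContext` are hypothetical names, and the header string is whatever your tracer serializes. The field names follow the SQS `SendMessage` parameters on the producer side and the Lambda SQS event record on the consumer side.

```js
// Hypothetical attribute name; any name agreed on by producer and consumer works.
const TRACE_ATTRIBUTE = "TraceHeader";

// Producer side: return a copy of the SendMessage parameters with the
// serialized trace context added as a String message attribute.
function withTraceContext(params, traceHeader) {
  return {
    ...params,
    MessageAttributes: {
      ...(params.MessageAttributes || {}),
      [TRACE_ATTRIBUTE]: { DataType: "String", StringValue: traceHeader },
    },
  };
}

// Consumer side: read the trace context back from a Lambda SQS event record.
// Lambda delivers message attributes with lower-camel-case field names.
function readTraceContext(record) {
  const attr = (record.messageAttributes || {})[TRACE_ATTRIBUTE];
  return attr && attr.stringValue ? attr.stringValue : null;
}
```

Note that with an SNS hop in between, attributes should only arrive as SQS message attributes when raw message delivery is enabled on the subscription; otherwise they are embedded in the JSON message body and would need to be parsed from there.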

mccauleyp commented 2 years ago

+1

willarmiros commented 2 years ago

Hi all,

I would first like to apologize that this capability is still not available. We hear you and understand your frustration. While we have been continuously working on delivering this, there have been major setbacks due to the complicated nature of the problem. As some have pointed out, we would like to provide a fully-managed solution for propagating trace context from SQS -> Lambda, however that has proven difficult because it is unclear how to map a batch of (independently traced) SQS messages to a single Lambda invocation (represented by 1 segment). We are still actively working on such a solution, and appreciate the constructive feedback we've received so far. In parallel, we're working with the Lambda team on possible workarounds to this problem. Once either is officially supported, we will update this issue.
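To illustrate the batching problem mentioned here: a single Lambda invocation can receive records that belong to several different traces, so there is no single obvious parent for the invocation's one segment. The sketch below groups the records of one SQS event by the `Root` value in their `AWSTraceHeader` attribute. The function names are hypothetical and the regex-based parsing is a simplification.

```js
// Extract the Root trace ID from an AWSTraceHeader string, or null if absent.
function traceRoot(header) {
  const match = /(?:^|;)\s*Root=([^;]+)/i.exec(header || "");
  return match ? match[1].trim() : null;
}

// Group the message IDs of one Lambda invocation by the trace they belong to.
function groupByTraceRoot(records) {
  const groups = new Map();
  for (const record of records) {
    const header = record.attributes && record.attributes.AWSTraceHeader;
    const key = traceRoot(header) || "untraced";
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record.messageId);
  }
  return groups;
}
```

If the resulting map has more than one key, the invocation is handling messages from more than one upstream trace.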

imiosga commented 2 years ago

+1

joshuamfrancis commented 2 years ago

+1 this capability is much sought after.

psimsa commented 2 years ago

Hi @willarmiros, do you have any news for us?

s1mrankaur commented 2 years ago

+1. It's been a while. Is there an update here?

joaogsleite commented 2 years ago

Any update on this? After 3 years we still can't trace a flow between SQS and a Lambda function?

yvele commented 2 years ago

Here is a workaround, inspired by the sources listed in the code comments below.

Everything is done on the Lambda consumer side:

import { Segment, setSegment, utils } from "aws-xray-sdk";

function createLambdaSegment(context, sqsRecord) {
  // AWSTraceHeader carries the trace context attached to the SQS message.
  const traceHeaderStr = sqsRecord.attributes.AWSTraceHeader;
  const traceData = utils.processTraceData(traceHeaderStr);

  // ApproximateFirstReceiveTimestamp is in epoch milliseconds; X-Ray uses seconds.
  const sqsSegmentEndTime = Number(sqsRecord.attributes.ApproximateFirstReceiveTimestamp) / 1000;

  const segment = new Segment(
    context.functionName,
    traceData.root,
    traceData.parent
  );
  segment.origin = "AWS::Lambda::Function";
  // Start the Lambda segment at the moment the message was first received
  // from the queue, so it lines up with the end of the SQS segment.
  segment.start_time = sqsSegmentEndTime;
  segment.addPluginData({
    function_arn: context.invokedFunctionArn,
    region: sqsRecord.awsRegion,
    request_id: context.awsRequestId
  });

  return segment;
}

/**
 * Set the current Lambda segment that instruments SQS.
 *
 * This is a workaround for https://github.com/aws/aws-xray-sdk-node/issues/208.
 *
 * Implementation inspired from:
 * - https://dev.to/aws-builders/x-ray-tracing-from-sqs-to-lambda-8md
 * - https://github.com/kyhau/aws-tools/blob/main/X-Ray/xray-sqs-to-lambda/handler.ts
 *
 * @see https://github.com/aws/aws-xray-sdk-node/issues/208#issuecomment-1285169865
 * @param {object} context Lambda [context](https://docs.aws.amazon.com/lambda/latest/dg/nodejs-context.html)
 * @param {object} sqsRecord SQS [record item](https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html)
 * @returns {object} X-Ray segment for Lambda
 */
export default function setLambdaSegmentFromSQS(context, sqsRecord) {
  const segment = createLambdaSegment(context, sqsRecord);
  setSegment(segment);
  return segment;
}

You then use it like this in your handler file:

import setLambdaSegmentFromSQS from "./setLambdaSegmentFromSQS";

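export async function handle(event, context) {
  // Only the first record's trace context is used; records from other
  // traces in the same batch are not linked.
  const segment = setLambdaSegmentFromSQS(context, event.Records[0]);
  try {
    await handleCore(event);
  } catch (err) {
    // Close the segment with the error so the failure shows up in the trace.
    segment.close(err);
    throw err;
  }
  segment.close();
}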
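The handler above takes its trace context from `event.Records[0]` only. For batches larger than one, a possible variation is to give every record its own segment. The sketch below is an untested idea, not part of the workaround: `processBatch` is a hypothetical name, and the segment factory and per-record handler are passed in so the function stays independent of the SDK.

```js
// Process every record in the batch under its own segment.
// makeSegment and handleRecord are supplied by the caller (hypothetical names).
async function processBatch(records, makeSegment, handleRecord) {
  const failedMessageIds = [];
  for (const record of records) {
    const segment = makeSegment(record);
    let error;
    try {
      await handleRecord(record);
    } catch (err) {
      error = err;
      failedMessageIds.push(record.messageId);
    }
    // Close with the error, if any, so the failure is attached to this record's segment.
    if (error !== undefined) {
      segment.close(error);
    } else {
      segment.close();
    }
  }
  return failedMessageIds;
}
```

In the handler, `makeSegment` could be `(record) => setLambdaSegmentFromSQS(context, record)`. How X-Ray renders several such segments from a single invocation has not been verified here.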

And with a proper X-Ray trace group it gets cleaner:

XRayTraceGroup:
  Type: AWS::XRay::Group
  Properties:
    GroupName: my-group
    InsightsConfiguration:
      InsightsEnabled: true
      NotificationsEnabled: false
    FilterExpression: !Sub |
      resource.arn = "${ProducerFunction.Arn}"
      OR resource.arn = "${Queue.Arn}"
      OR resource.arn = "${CollectorFunction.Arn}"      

s1mrankaur commented 2 years ago

@yvele Thanks for sharing. If only there were a timeline for this feature request, we'd know whether it's worth jumping on the workarounds already. This looks great though, thanks much!

willarmiros commented 2 years ago

Hi everyone - unfortunately we are not able to share any timelines about new features over GitHub. However this issue will be updated the moment we have a publicly available update.

owenmorgan commented 1 year ago

https://aws.amazon.com/about-aws/whats-new/2022/11/aws-x-ray-trace-linking-event-driven-applications-amazon-sqs-lambda/

Assuming this is a resolution?

mgorski-mg commented 1 year ago

Yes, it is ❤

willarmiros commented 1 year ago

As a couple astute commenters have noticed, today we are thrilled to announce support for linking traces from SQS Queues that are polled by AWS Lambda functions! This functionality is available out of the box for all Lambda functions with Active Tracing enabled triggered by SQS Queues. I'm only the messenger here, so I'd like to thank all of the other engineers who put in so much hard work to deliver this feature!

Learn more here: https://aws.amazon.com/about-aws/whats-new/2022/11/aws-x-ray-trace-linking-event-driven-applications-amazon-sqs-lambda/

Please use the built-in feedback link on the AWS Console, reach out to your AWS Support contact, or of course open issues on the SDK repos to leave feedback for X-Ray. There is more to come! With that, I'll be closing this issue out.

s1mrankaur commented 1 year ago

Great! So glad to see an update here. However, the complete flow mentioned in the topic is still not supported (Lambda -> SNS -> SQS -> Lambda) I believe? I can't see SNS -> SQS flow in my traces @willarmiros

willarmiros commented 1 year ago

Good callout @s1mrankaur - we will continue to track the SNS -> SQS case in this issue: https://github.com/aws/aws-xray-sdk-go/issues/218

We will also update here when that workflow is fully supported :)

willarmiros commented 1 year ago

Hi folks, wanted to cross-post here that today we are happy to announce Amazon SNS support for active tracing with X-Ray! If you have an SNS->SQS->Lambda workflow for example, you will now be able to see that workflow end-to-end for the first time by opting in to active tracing on your SNS topic.

Read more here: https://aws.amazon.com/about-aws/whats-new/2023/02/amazon-sns-x-ray-active-tracing-visualize-analyze-debug-application-performance/

s1mrankaur commented 1 year ago

@willarmiros This is amazing! I am guessing there isn't support for this in the Serverless Framework yet, i.e., opting in to active tracing on SNS topics?

provider:
  tracing:
    apiGateway: true
    lambda: true

This is what I have so far. Is there any additional config that I can add or enable to see this in action? @willarmiros

willarmiros commented 1 year ago

The active tracing config is supported via CloudFormation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-sns-topic.html#cfn-sns-topic-tracingconfig

I am not sure how that makes its way over to the serverless framework tbh, but if there is a place to put in that request with them I'd love to do so!
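As a rough illustration of the CloudFormation property linked above, the resource could look like the object below. This is a sketch under assumptions: `EventsTopic` is a placeholder logical ID, and it assumes the topic is declared as a raw CloudFormation resource (for example under the Serverless Framework's `resources` section) rather than created implicitly by an event definition. The same keys apply when written in YAML.

```js
// Raw CloudFormation resource expressed as a plain object.
// "EventsTopic" is a placeholder logical ID, not a name from this thread.
const snsTracingResources = {
  Resources: {
    EventsTopic: {
      Type: "AWS::SNS::Topic",
      Properties: {
        // CloudFormation accepts "Active" or "PassThrough" for this property.
        TracingConfig: "Active",
      },
    },
  },
};

// Print the fragment as JSON for inspection.
console.log(JSON.stringify(snsTracingResources, null, 2));
```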

s1mrankaur commented 7 months ago

Thanks @willarmiros . I've now posted the feature request here https://github.com/serverless/serverless/issues/12385

dschro-1993 commented 6 months ago

What about DynamoDB Streams and lambdas? Is it anywhere on the roadmap?