DataDog / datadog-lambda-js

The Datadog AWS Lambda Library for Node
Apache License 2.0

TypeError exception raised when tracing is capturing request/response payloads #464

Closed: gavllew closed this issue 7 months ago

gavllew commented 7 months ago

Expected Behavior

A Lambda invocation succeeds, and a trace is sent off to DataDog.

Actual Behavior

The DataDog tracing is throwing an exception after our function code completes successfully, causing the Lambda invocation to fail.

Steps to Reproduce the Problem

  1. Set up API GW and Lambdas using Serverless Framework.
  2. Add serverless-plugin-datadog to send traces/logs to DataDog.
  3. Configure captureLambdaPayload: true, as per the docs.

We saw the issue when we updated to serverless-plugin-datadog v5.55.0; reverting to v5.49.0 (which we were running previously) fixed it again.

Specifications

Stacktrace

2024-01-22T11:03:46.313Z    0f1e108f-1e91-46f6-9512-c46b3b3eaece    ERROR   Invoke Error    {
    "errorType": "TypeError",
    "errorMessage": "Cannot read properties of undefined (reading 'substring')",
    "stack": [
        "TypeError: Cannot read properties of undefined (reading 'substring')",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:37:74)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at tagObject (/opt/nodejs/node_modules/datadog-lambda-js/utils/tag-object.js:61:17)",
        "    at TraceListener.onEndingInvocation (/opt/nodejs/node_modules/datadog-lambda-js/trace/listener.js:147:35)",
        "    at /opt/nodejs/node_modules/datadog-lambda-js/index.js:235:80",
        "    at step (/opt/nodejs/node_modules/datadog-lambda-js/index.js:44:23)",
        "    at Object.next (/opt/nodejs/node_modules/datadog-lambda-js/index.js:25:53)",
        "    at fulfilled (/opt/nodejs/node_modules/datadog-lambda-js/index.js:16:58)"
    ]
}

From my own digging this looks related to PR #430, where the following code appeared:

return currentSpan.setTag(key, redactVal(key, JSON.stringify(obj).substring(0, 5000)));

The MDN docs for JSON.stringify include this note:

JSON.stringify() can return undefined when passing in "pure" values like JSON.stringify(() => {}) or JSON.stringify(undefined).

I suspect we've been unlucky, and one of our request/response payloads may have an unexpected value at the depth cutoff of 10, leading to a substring method call on undefined.
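To illustrate the suspected failure mode, here is a minimal sketch in plain Node.js. The safeStringify helper is hypothetical (it is not part of datadog-lambda-js and is not necessarily how the library was patched); it only shows how a guard around JSON.stringify avoids calling substring on undefined.

```javascript
// JSON.stringify returns undefined (not a string) for these "pure" values,
// so chaining .substring() onto the result throws a TypeError.
const problemValues = [undefined, () => {}];
for (const value of problemValues) {
  console.log(typeof JSON.stringify(value)); // "undefined" for both
}

// null, by contrast, is a valid JSON value and serializes to a string.
console.log(typeof JSON.stringify(null)); // "string"

// Hypothetical helper (an assumption for illustration): fall back to
// String(value) when JSON.stringify yields undefined, then truncate.
function safeStringify(value, maxLength = 5000) {
  const serialized = JSON.stringify(value);
  const text = serialized === undefined ? String(value) : serialized;
  return text.substring(0, maxLength);
}

console.log(safeStringify(undefined)); // "undefined"
console.log(safeStringify({ a: 1 }));  // {"a":1}
```

Note that an undefined property inside an object is simply omitted by JSON.stringify, so the error only appears when the value being stringified is itself undefined or a function, which is consistent with it surfacing only at the depth cutoff.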

astuyve commented 7 months ago

Thanks for this report @gavllew, that's super helpful. We'll take a look!

joeyzhao2018 commented 7 months ago

Pending Layer Release

gavllew commented 7 months ago

Thank you for the quick response! I'm not convinced that #465 will fully bulletproof the code, though. It handles the case of obj === null, but null is a valid JSON value (JSON.stringify(null) returns the string "null"), so it shouldn't have been causing the problem in the first place. I think it's more likely that an undefined value (or even a function) is causing the error.

For more context: our Lambda is using Middy middleware to do initial processing of the event before the (simplified) handler is invoked, and the middleware often extends/enhances/mutates the original event object. Unless the DataDog wrapper makes a clone of the original event object, it may be trying to capture the final, mutated event, which could contain both undefined values and functions.
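A rough sketch of the mutation scenario described above, under stated assumptions: enrichEvent is a hypothetical stand-in for a middleware step (it is not Middy's API), and snapshotEvent is a hypothetical helper, not something the DataDog wrapper is known to do. It shows how an event that arrived as plain JSON can later hold values that JSON.stringify cannot serialize, and how a copy taken up front stays clean.

```javascript
// Hypothetical stand-in for a middleware step that mutates the event in place.
function enrichEvent(event) {
  event.logger = () => {};   // a function value
  event.session = undefined; // an undefined value
  return event;
}

// Hypothetical helper: the incoming Lambda event is parsed from JSON,
// so a JSON round trip produces a safe deep copy of it.
function snapshotEvent(event) {
  return JSON.parse(JSON.stringify(event));
}

const event = { path: "/orders", body: "{}" };
const captured = snapshotEvent(event); // taken before any middleware runs
enrichEvent(event);                    // middleware mutates the original

console.log(typeof event.logger);      // "function"
console.log("logger" in captured);     // false
```

Anything that walks the mutated event key by key and stringifies event.logger or event.session on its own would get undefined back, whereas the snapshot contains only the original JSON-safe fields.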

gavllew commented 7 months ago

Thanks @joeyzhao2018!

joeyzhao2018 commented 6 months ago

The v105 layer should fix this issue: arn:aws:lambda:<AWS_REGION>:464622532012:layer:Datadog-Node14-x:105