instana / nodejs

Node.js in-process collectors for Instana
MIT License
69 stars 36 forks source link

[Bug]: Lambda Execution Hangs After Handler Response Return When Using Extension With No API Key #976

Closed rod-dvla closed 10 months ago

rod-dvla commented 10 months ago

Problem Description

When using the Extension with no api key configured (tested with Node 18) lambda invocation takes an additional 1.2 seconds. Its worth noting that the time taken for the request handler to return response to a caller does not increase, but after the handler returns response the lambda hangs for a period of time.

Was able to reproduce the issue with a vanilla handler on new serverless app. Two handlers, one with and the other without the layer.

Handler Config:

Screenshot 2023-12-18 at 17 28 23

Screenshot 2023-12-18 at 17 28 47

Invocation times with extension:

Screenshot 2023-12-18 at 17 27 52

Invocation times without extension:

Screenshot 2023-12-18 at 17 28 01

I found this issue in our production environments, when disabling the lambda extension. In this case there is no reason for data to be sent to the extension, nor for the extension to be holding the handler for an additional period of time.

Hope you are able to help, many thanks and all the best.

Short, Self Contained Example

Create New Serverless App (Node18), two seperate handlers. Configure one lambda to use the layer (tested with layer version 168 and 171). Set LAMBDA_HANDLER env var pointing at your handler and using instana-aws-lambda-auto-wrap.handler, no other environment variables set.

The other configured to use the handler directly (no Instana extension or env variables either).

Call each and note the difference in lambda duration (not api latency).

Node.js Version

Node 18

package.json

No Package.json required.

package-lock.json

No package-lock.json required.
aryamohanan commented 10 months ago

@rod-dvla Thanks for your report. We are taking a look.

aryamohanan commented 10 months ago

@rod-dvla Based on your message, we understand that when you disable the lambda extension, the duration significantly increases, causing the handler to be held for an additional period. Please correct us if we are wrong.

We have been unable to replicate the issue on our end. Could you kindly share the logs to help us gather more information?

Additionally, could you provide information on the steps you take to disable the extension?

Here's what we've attempted so far:

  1. Created a lambda function (node18) without the extension and no api keys in CommonJS.
  2. Created a lambda function (node18) with the extension and no api key in CommonJS, setting the LAMBDA_HANDLER to instana-aws-lambda-auto-wrap.handler, with no other environment variables configured.
  3. Invoked the endpoints multiple times and monitored the durations.

Upon reviewing the logs, we have not identified the reported issue.

Note: To produce logs, you can utilize the environment variable INSTANA_DEBUG with the value set to debug.

kirrg001 commented 10 months ago

@rod-dvla Do you experience that under load?

rod-dvla commented 10 months ago

@kirrg001 I get this on each request regardless of load.

@aryamohanan Will look to set INSTANA_DEBUG and get the info back to you today.

rod-dvla commented 10 months ago

@aryamohanan Have collected log output for an endpoint with extension and INSTANA_DEBUG set to debug:

log-events-viewer-result(8).csv

For reference have also included the log output for function without.

log-events-viewer-result(9).csv

This is for a serverless stack with 2 files:

serverless.yml

service: basic-instana-extension-test

frameworkVersion: "3"

provider:
  name: aws
  runtime: nodejs18.x
  region: eu-west-2

functions:
  hello:
    handler: handler.hello
    events:
      - httpApi:
          path: /instana/without
          method: get
  with-instana-extension:
    handler: instana-aws-lambda-auto-wrap.handler
    environment:
      LAMBDA_HANDLER: handler.hello
      INSTANA_DEBUG: debug
    layers:
      - arn:aws:lambda:eu-west-2:410797082306:layer:instana-nodejs:171
    events:
      - httpApi:
          path: /instana/with
          method: get

handler.js

'use strict';

module.exports.hello = async (event) => {
  return {
    statusCode: 200,
    body: JSON.stringify(
      {
        message: 'Go Serverless v1.0! Your function executed successfully!',
        input: event,
      },
      null,
      2
    ),
  };

  // Use this code if you don't use the http event with the LAMBDA-PROXY integration
  // return { message: 'Go Serverless v1.0! Your function executed successfully!', event };
};

Note the duration of the lambda is much higher (1500sec odd). From what I've seen api response latency is roughly the same with/without extension, but the lambda duration is a lot higher on the version with extension.

kirrg001 commented 10 months ago

I am able to reproduce the problem when I increase the memory to 512mb. We will work on a fix 👍

aryamohanan commented 10 months ago

@rod-dvla We appreciate your understanding regarding the delayed response, as our team was on vacation last week. Currently, we are diligently working on resolving the issue, and we will notify you promptly once the fix is implemented. In the interim, should you wish to disable the extension, you can achieve this by setting the environment variable INSTANA_DISABLE_LAMBDA_EXTENSION to a non-empty string.

aryamohanan commented 10 months ago

Hi @rod-dvla We've released fix in v3.1.1. Kindly verify whether the issue has been addressed. Thank you

rod-dvla commented 10 months ago

Hi @aryamohanan, am taking a look now. Will get back to you shortly with results.

rod-dvla commented 10 months ago

Hi @aryamohanan, apologies for the delay. Will post over logs to confirm tomorrow, but am still seeing the same extended duration. As you say I can get around by setting the ..DISABLE.. environment variable. Behaviour seems to be aligned with the scenario below:

This is using layer with arn arn:aws:lambda:eu-west-2:410797082306:layer:instana-nodejs:173, which on start up lists version 3.1.1 as being used.

aryamohanan commented 10 months ago

@rod-dvla Thank you for taking the time to share these details with us. It's important to note that testing without an API key may not produce accurate results as it is a mandatory field.

We truly value your efforts in providing us with this insightful feedback. I'd like to inform you that we have logged an internal bug report to address this issue. While it's currently not our top priority, your input is important to us. If this use case is causing significant issues or impeding your application, we kindly encourage you to reach out to us again.

rod-dvla commented 10 months ago

Thank you for taking a look.