middyjs / middy

🛵 The stylish Node.js middleware engine for AWS Lambda 🛵
https://middy.js.org
MIT License
3.71k stars 375 forks source link

Lazy loading and performance #493

Closed stavros-zavrakas closed 4 years ago

stavros-zavrakas commented 4 years ago

We are using middy and lambda functions for a bit of time now and we are wondering if there are any improvements that potentially we can do to improve startup times. I found this interesting issue with some benchmarks in the past and I was wondering if there are any improvements that had been done since then: https://github.com/middyjs/middy/issues/436

We use few middy middlewares and we just tried to do a sample project with the beta versions.

Scenario 1: The startup times of lambdas just with middy and doNotWaitForEmptyEventLoop middleware attached are quite good. This is what we get with this scenario:

Duration: 100.43 ms Billed Duration: 200 ms Memory Size: 128 MB Max Memory Used: 95 MB  Init Duration: 363.62 ms

Scenario 2: The startup times of lambdas just with middy, doNotWaitForEmptyEventLoop and secretsManager are:

Duration: 1578.58 ms    Billed Duration: 1600 ms    Memory Size: 128 MB Max Memory Used: 103 MB Init Duration: 380.81 ms

Scenario 3: The startup times of lambdas just with middy, doNotWaitForEmptyEventLoop, secretsManager and dbManager are:

Duration: 3888.55 ms    Billed Duration: 3900 ms    Memory Size: 128 MB Max Memory Used: 107 MB Init Duration: 362.68 ms

The secrets manager probably can't be improved much because has to fire the request and retrieve all the keys. It already uses a VPC endpoint and stays in the internal AWS network. The same for the database which lives in private subnets and accepts traffic only from lambdas and bastions.

It seems that all the ORM (like sequelize) or query builders (like knex) are doing quite a lot of lazy loading and that's why they spike when they are cold.

Any thoughts for improvements? Something that does not make sense or can be improved on the AWS side?

This profiler might help to identify slow middlewares, it is a bit timeconsuming to add timers and later deploy them to find out what is slow: https://github.com/middyjs/middy/issues/446

This is an example of the lambda function:

'use strict';

const middy = require('@middy/core');
const secretsManager = require('@middy/secrets-manager')
const doNotWaitForEmptyEventLoop = require('@middy/do-not-wait-for-empty-event-loop')
const dbManager = require('@middy/db-manager');

const secrets_field = 'SECRETS';

const {
  environment,
  DB_HOST
} = process.env;

const handler = async (event, context) => {
  let records = [];

  const { db } = context;
  if (db) {
    console.time('query');
    records = await db.select('*').from('recipes');
    console.timeEnd('query');
  }

  const response = {
    statusCode: 200,
    body: JSON.stringify(records)
  };

  return response;
};

exports.get_all_recipes = middy(handler)
  .use(doNotWaitForEmptyEventLoop({ runOnError: true }))
  .use(secretsManager({
    cache: true,
    secrets: {
      [secrets_field]: `${environment}/Secrets`
    }
  }))
  .use(dbManager({
    config: {
      client: 'mysql2',
      connection: {
        host: DB_HOST,
        port: '3306'
      },
      secretsPath: secrets_field
    },
  }));
willfarrell commented 4 years ago

Love the detail you're going into here. I spent a lot of time on this very recently. Some of the discoveries have already been integrated into middy v1 (#436). The rest of the performance comes down to architecture.

Factors to think about:

Below is my middleware with the non-db parts stripped out. Cold starts are <700ms with running a handler that typically does multiple db calls. Subsequent calls are much much faster. Something to benchmark against.

const tls = require('tls')

const middy = require('@middy/core')

// Database
// process.env.DEBUG = 'knex:*'
const ca = require('fs').readFileSync(`${__dirname}/rds-ca-2019-root.pem`)
const pg = require('pg')
pg.types.setTypeParser(1700, parseFloat) // Parse string Decimals

const doNotWaitForEmptyEventLoop = require('@middy/do-not-wait-for-empty-event-loop')
const dbManager = require('@middy/db-manager')

module.exports = {
 handler: middy(handler)
      .use(doNotWaitForEmptyEventLoop())
      .use(
        dbManager({
          rdsSigner: {
            region: process.env.AWS_REGION,
            hostname: process.env.rdsHostname,
            username: 'iam_api',
            database: process.env.rdsDatabase,
            port: process.env.rdsPort
          },
          config: {
            client: 'pg',
            connection: {
              host: process.env.rdsHostname,
              user: dbUsername,
              database: process.env.rdsDatabase,
              port: process.env.rdsPort,
              ssl: {
                rejectUnauthorized: true,
                ca,
                checkServerIdentity: (host, cert) => {
                  const error = tls.checkServerIdentity(host, cert)
                  if (
                    error &&
                    !cert.subject.CN.endsWith('.rds.amazonaws.com')
                  ) {
                    return error
                  }
                }
              }
            },
            useNullAsDefault: true,
            acquireConnectionTimeout: 10 * 1000,
            pool: {
              min: 0,
              max: dbConcurrency
            }
          }
        })
      )
 }
stavros-zavrakas commented 4 years ago

Really nice and detailed response @willfarrell . I wasn't aware of the rds signer, I'm going to have a look at it. I'm still quite impressed with the fact that you were able to bring the cold starts down to less than 700ms. As you see above, we are close to 4 seconds which is not good at all. I wonder if the fact that you are using postgres helps. Maybe pg has less lazy loading modules?

Our layout is:

We'll let you know when we'll improve the cold starts.

willfarrell commented 4 years ago

Personally I split my API into an edge and regional, keeping all DB requests in the same region. This way you only get hit with long latency once, especially when you're doing multiple db calls. It's worth testing imo.

With RDS Serverless, it has an API thay may be faster and removes the need for the RDS Proxy (haven't tested yet because it's not supported for postgres in Canada). You may also be hitting the DB cold start as well if you have that enabled.

Other areas on my list to research:

Keep us posted on how it goes.

willfarrell commented 4 years ago

Here is a quick and dirty postgres-db-manager. Quick test show it's slower than knex + pg, will need to do some digging into how they got the results at https://github.com/porsager/postgres-benchmarks/blob/master/index.js

const postgres = require('postgres')
const RDS = require('aws-sdk/clients/rds')

let dbInstance

module.exports = (opts) => {
  const defaults = {
    client: postgres,
    config: null,
    forceNewConnection: false,
    rdsSigner: null,
    removeSecrets: true,
    secretsParam: 'password', // if `secretsPath` returns an object, ignore value
    secretsPath: null // provide path where credentials lay in context, default to try to get RDS authToken
  }

  const options = Object.assign({}, defaults, opts)

  const cleanup = (handler, next) => {
    if (options.forceNewConnection && (dbInstance && typeof dbInstance.destroy === 'function')) {
      dbInstance.destroy((err) => next(err || handler.error))
    }
    next(handler.error)
  }

  const signer = (config) => {
    if (typeof config.port === 'string') config.port = Number.parseInt(config.port)
    const signer = new RDS.Signer(config)
    return new Promise((resolve, reject) => {
      signer.getAuthToken({}, (err, token) => {
        if (err) {
          reject(err)
        }
        resolve(token)
      })
    })
  }

  return {
    before: async (handler) => {
      const {
        client,
        config,
        forceNewConnection,
        rdsSigner,
        removeSecrets,
        secretsParam,
        secretsPath
      } = options

      if (!config) {
        throw new Error('Config is required in dbManager')
      }

      if (!dbInstance || forceNewConnection) {
        let secrets = {}

        if (rdsSigner) {
          secrets[secretsParam] = await signer(rdsSigner)
        } else if (secretsPath) {
          // catch Secrets Manager response
          if (typeof handler.context[secretsPath] === 'object') {
            secrets = handler.context[secretsPath]
          } else {
            secrets[secretsParam] = handler.context[secretsPath]
          }
        }
        config.connection = Object.assign({}, config.connection || {}, secrets)

        dbInstance = client(config.connection)
      }

      Object.assign(handler.context, { db: dbInstance })
      if (secretsPath && removeSecrets) {
        delete handler.context[secretsPath]
      }
    },

    after: cleanup,
    onError: cleanup
  }
}
const middy = require('@middy/core')
const doNotWaitForEmptyEventLoop = require('@middy/do-not-wait-for-empty-event-loop')
const dbManager = require('./postgres-db-manager')

const ca = require('fs').readFileSync(`${__dirname}/rds-ca-2019-root.pem`)
const tls = require('tls')
const handler = middy(app)
  .use(doNotWaitForEmptyEventLoop())
  .use(dbManager({
    rdsSigner: {
      region: process.env.AWS_REGION,
      hostname: process.env.rdsHostname,
      username: 'iam_developer',
      database: process.env.rdsDatabase,
      port: process.env.rdsPort,
    },
    config: {
      connection: {
        host: process.env.rdsHostname,
        username: 'iam_metadata',
        database: process.env.rdsDatabase,
        port: process.env.rdsPort,
        max: 3,
        ssl: {
          rejectUnauthorized: true,
          ca,
          checkServerIdentity: (host, cert) => {
            const error = tls.checkServerIdentity(host, cert)
            if (error && !cert.subject.CN.endsWith('.rds.amazonaws.com')) {
              return error
            }
          }
        }
      }
    }
  }))
stavros-zavrakas commented 4 years ago

That's very detaild @willfarrell , thanks.

I think part of the solution is the thing that you mention here:

test moving the DB connection (and do select 1) outside the middleware event during the lambda pre-run phase (https://medium.com/@hichaelmart/shave-99-93-off-your-lambda-bill-with-this-one-weird-trick-33c0acebb2ea)

Ideally, the initialization of the secrets manager should happen outside the lambda handler. Like that we can initialize the connection outside the lambda function as well. Now I can understand because of the async JavaScript fashion can't be achieved quite easily. How bad can be to use this module to connect to secrets manager and get the credentials outside the handler? https://www.npmjs.com/package/sync-request