nextauthjs / next-auth

Authentication for the Web.
https://authjs.dev
ISC License
24.26k stars 3.38k forks source link

Cold Boot Performance Issues #1673

Closed mod-flux closed 3 years ago

mod-flux commented 3 years ago

Describe the bug I have observed extended response times for any NextAuth endpoint in the scenario of a cold boot of a serverless function. This is regardless of whatever method or encryption you're using (JWT/database, encrypted and non-encrypted).

These performance issues are in the range of multiple seconds of response time. For context, 'cold boot' occurs in these scenarios:

Steps to reproduce

  1. Fresh Next.js application using the latest version.
  2. Implement next-auth library using the standard implementation
  3. Implement any provider, custom or pre-defined (I've used custom & Google for testing)
  4. Launch on Vercel (or run locally)
  5. Go to any endpoint provided by NextAuth (I've been using <your deployment>/api/auth/session for testing)
  6. Note that on the initial request, it can take upward of ~5 seconds to respond (this is regardless of whether they have a JWT cookie to decode or a database to connect to)
  7. Repeated requests to the same endpoint once the function is 'hot' and available takes under ~100ms

Further to the above, I've noted that when 'hot', the /session endpoint uses 131MB of memory so to optimize serverless costs, we've adjusted our vercel.json for /session to provision itself with 192MB of available memory as opposed to the default of 1GB. In this scenario, the initial requests takes upward of ~9 seconds.

Expected behavior Initial requests on cold functions don't take over a second to respond, especially for those without an active session (e.g. no JWT cookie for stateless sessions).

Screenshots or error logs I did some profiling as to what was taking the majority of the time in the aforementioned scenarios and got the following results from a Vercel function on the /session endpoint using 192MB of memory:

typeorm: 6241.299ms
prisma: 1.078ms
adapters: 6280.308ms
jwt: 480.770ms
parse-url: 0.546ms
logger: 0.304ms
cookie: 1.041ms
default-events: 0.649ms
default-callbacks: 0.378ms
providers: 0.301ms
callback-url-handler: 0.458ms
extend-req: 0.244ms
routes: 318.520ms
pages: 42.476ms
csrf-token-handler: 36.856ms
create-secret: 0.382ms
pkce-handler: 1.179ms
state-handler: 0.491ms
_NextAuthHandler: 19.533ms
session: 0.012ms

Additional context It's clear that most functions/libraries are fairly quick and performant but the real outlier here is typeorm which is part of adapters. This is included in the first line of the server index.js.

I'm not proficient with this library but at a glance, everything in adapters only appears to be being used when someone provides a customer adapter in the options or database is being used to store sessions, neither of which is default to next-auth and need to be explicitly set as stateless sessions is the default.

I tested my theory and removed the import adapters from '../adapters' entirely, deployed and then attempted the /session endpoint again and found it worked. More crucially it reduced the cold start response time to a more manageable ~2 seconds. Furthermore, it reduced the memory usage from 131MB to 94MB, which meant we can reduce our memory allocation even further.

To this end, I propose that it doesn't make sense to include this significant overhead in the majority of cases when users are using the next-auth default config of stateless sessions. May I suggest we look to dynamically import the ./adapters in the server/index.js only in the situations where it's necessary (e.g. user has configured database sessions).

I'm also looking at other areas to optimize this initial load, specifically jwt and routes but clearly this is an improvement on what it was initially.

balazsorban44 commented 3 years ago

This is part of our goals with extracting adapters to https://github.com/nextauthjs/adapters. Removing typeorm from the default imports would drop the package size with 71%(!) image

@ndom91 has an ongoing effort in splitting out the adapters from the core at https://github.com/nextauthjs/adapters/issues/37, and I think he would appreciate any help!

jose is another package that we haven't updated in a while (we use 1.x, when only 2.x and 3.x are maintained anymore). Upgrading it would probably result in some package size reduction as well. We just discussed this with @lluia. Although I don't know how hard this would be. I am happy to accept PRs for it!

mod-flux commented 3 years ago

This is all great stuff, thanks @balazsorban44 and glad to see it's already in progress. I will checkout https://github.com/nextauthjs/adapters/issues/37 and have a peak at jose, which I assume is involved in some way in jwt.js. :-)

ndom91 commented 3 years ago

Yeah we're hoping to be able to get the adapters out of the primary package so users who are just using jwts get the fastest version possible!

Unfortunately there hasn't been a ton of progress lately due to a lack of time on our side, so if your interested we'd be happy for any and all support :)

balazsorban44 commented 3 years ago

Some work has been laid out here just now: https://github.com/nextauthjs/next-auth/pull/1682

mod-flux commented 3 years ago

Some work has been laid out here just now: #1682

This is awesome, thanks for the heads up @balazsorban44 and your size reductions are looking awesome! Shall we close this issue and relate it to #1682? I can mention the outdated version of jose on there aswell so everything is in one thread.

eak12913 commented 3 years ago

Before you folks close this - I just wanted to add that we're experiencing a very similar issue but we're not running our app on Lambda/Vercel - but in EKS and we're seeing the same behavior: The first request always takes " a while" (6-15 seconds observed). We are also seeing periodic multi second delays occasionally. It feels very similar to what @mod-flux describes in their issue but rather than a lambda function cache expiring - it seems like this is present even when lambda is not involved.

We are not really using TypeORM in our application (other than next-auth using it internally). We could consider writing our own DB adapter using something like KnexJS - but what I don't understand is whether next-auth would still load TypeORM even if we use a custom DB adapter?

For reference we're using:

    "next": "^10.0.7",
    "next-auth": "3.4.1",
ndom91 commented 3 years ago

Hey no so if you don't define an adapter at all, i.e. your just using JWTs and not saving user info, then next-auth won't load an adapter at all. If you define your own adapter, it'll only load that one, whatever you define in the adapters object in the config 👍

We have some examples in the docs for writing your own adapter and are in the process of pulling them out into their own repo in nextauthjs/adapters if you want to take a look.

eak12913 commented 3 years ago

your just using JWTs and not saving user info,

Interesting - We are indeed using JWTs - but we are also using two Providers:

By using these two providers next-auth is interacting with our DB. I believe that it's using TypeORM to do this (as we don't specify anything to the contrary in config)

I think the crux of my question above is:

We have some examples in the docs for writing your own adapter

Yep - I think this will likely be the route we have to take - but before we invest the time in writing an adapter I just wanted to double check that my own adapter would bypass having TypeORM.

balazsorban44 commented 3 years ago

Currently typeorm (and prisma as well, actually) is pulled in/downloaded/bundled, no matter if you use a DB or not. (Or even if you use a custom adapter) https://github.com/nextauthjs/next-auth/blob/17b789822de64eb845e1e8e49ea83dbff56344f4/src/server/index.js#L1 https://github.com/nextauthjs/next-auth/blob/17b789822de64eb845e1e8e49ea83dbff56344f4/src/server/index.js#L89

This will hopefully be fixed in #1682, but this will require that you will have to install typeorm yourself if you actually use it.

In case you don't rely on a database (or you will write your own adapter with a much smaller footprint), the bundle size will drastically drop. At least that is the intent. Follow the PR on the progress.

ndom91 commented 3 years ago

@balazsorban44 Yeah it'll get installed and what not obviously, but is it really then also bundled and shipped to the user even if you don't use any adapter?

balazsorban44 commented 3 years ago

checked with webpack-bundle-analyzer, it was in the production build, bundled in the api handler, yes. (see the PR, I have a screenshot before and after)

eak12913 commented 3 years ago

@balazsorban44 - Thanks for the clarification. This makes sense and I think our approach will have to be to wait for #1682 and then to write our own DB adapter.

Would you have a reasonable guess/noncommittal guess as to when that PR might make it into a release?

robert-moore commented 3 years ago

I understand that typeorm is large, but it seems that the big 6 second initial delay is probably from connecting to the DB, not from loading up the module within the serverless context?

mod-flux commented 3 years ago

@robert-moore I'm not connecting to any external DB or source in the context of my original issue and investigation. That 6 seconds is purely booting with what's on the Lambda, no external factors.

kripod commented 3 years ago

Thank you for all these observations!

I’ve just set up connection pooling today (with pgbouncer) and found that cold starts still go way slower than expected, possibly due to the overwhelmingly large size of Lambda functions that next-auth might be responsible for.

rozenmd commented 3 years ago

Random observation: I noticed if I check the request for a session, and check if the session is valid and in the db myself (via pg-promise), cold-boot time goes down to 2.8 seconds for logged in users (likely due to the size of the function).

The pseudocode of my hack is:

if (session in req.cookies){
  session = await check_session_in_db();
} else {
  session = await getSession({req});
}
balazsorban44 commented 3 years ago

So it's a bit hard to track this further down, as the issue at hand is not directly related to code. In a best effort, typeorm and prisma at least won't be bundled in v4, and we aim to keep the number of dependencies as low as possible for a small bundle size. If anyone has a better idea what could cause this issue, please open a new issue with a reproduction. 🙏