nearform / get-jwks

Fetch utils for JWKS keys
MIT License
25 stars 16 forks source link

inefficiency of jwk to pem conversion causes performance issues #297

Closed marcin-iwanow closed 1 month ago

marcin-iwanow commented 1 month ago

In our setup, we verify our jwt tokens on an authoriser, which we deploy on AWS as a lambda. We tried to switch to get-jwks but with the config we used till now (128MB of memory) we observed that it's extremely slow: verifying a jwt signed with a EC key was taking 6s on average (!!!) if the key hasn't yet been cached. We traced it back to inefficient performance of the jwk-to-pem package, which was responsible for almost the entire processing time.

I'd suggest to change it, maybe just getting rid of the dependency at all (like jwks-rsa does)? Anyways it's stale for like 2 years now.

simoneb commented 1 month ago

Thanks for reporting, what were you using in your previous flow to do the verification and which didn't exhibit any performance issues?

marcin-iwanow commented 1 month ago

jwks-rsa, and we stay with it for now (after realising that the rsa in the name is a false flag, and they also support EC 😄 )

imdavidmin commented 1 month ago

I might not have the full context here, out of curiosity why do you think the package takes 6s to do this task? It's not related to cold starts on Lambda I assume, and the package running locally would also take that long to process?

I've used jose before for all related JWT stuff, and have also used the web crypto API to do parsing / reading. What's the proposed solution for now if we want to get rid of jwk-to-pem?

simoneb commented 1 month ago

@imdavidmin it seems @marcin-iwanow is suggesting to switch to jwks-rsa. Shall we do a quick experimento to see if we notice any performance boost by doing the switch?

marcin-iwanow commented 1 month ago

well, jwks-rsa is rather an alternative to get-jwks then to jwk-to-pem. I am not sure what they use for the conversion. And regarding the performance, locally I couldn't really reproduce (I was running the lambda in serverless offline and couldn't limit resources). But on aws it struggles a lot with the low setup: I mentioned 128MB of memory (the minimum possible) but they couple it with a cpu config as far as I know, and I expect some cpu expensive operation that can be the bottleneck. For RSA the conversion was already very slow (1.5s?) but for EC it was just not acceptable

imdavidmin commented 1 month ago

I see, so the proposal is to deprecate get-jwks altogether and use a pre-existing library for those needs? In which case, perhaps we'd need to figure out if an alternative we use instead covers all our existing use cases for get-jwks

There's a good comparison of JWT libraries here from jwt.io

imdavidmin commented 1 month ago

Adding in performance tests. Results so far show the normal success path (tests from getJwk.spec.js) are running at sub 25ms (11ms on my local machine).

imdavidmin commented 1 month ago

@marcin-iwanow To diagnose this issue further, did you already test both RSA and EC keys on your local machine with good performance, but the same keys on your Lambda instance was taking very long, with 128MB memory? I'd like to be able to reproduce the issue to debug this further. Thanks

marcin-iwanow commented 1 month ago

on my local machine I haven't seen any significant differences, but also haven't profiled it carefully. (btw note that in your tests you use only RSA keys if I am not mistaken, maybe makes sense to add other ones also, but it's at your discretion ofc :) )

simoneb commented 1 month ago

@marcin-iwanow can you provide a repro please?

imdavidmin commented 1 month ago

@marcin-iwanow I did a test myself deploying it on Lambda, also with 128MB memory. I've set up a dummy domain at https://flat-bush-1e16.david-min-c24.workers.dev/.well-found/jwks.json where you can see a sample EC key. I've then run the Lambda function with the following payload

{
  "domain": "https://flat-bush-1e16.david-min-c24.workers.dev",
  "alg": "ES256",
  "kid": "1"
}

This request was responded to in 291ms, in my first request, so no caching should have been there. I didn't configure anything on the Lambda function differing from default.

So without further information / repo from you, I'm not able to diagnose further. I can put in further test scenarios processing EC keys as well as RSA keys, but this is unlikely to yield significantly different results to the deployed test run.

simoneb commented 1 month ago

Closing as we're missing a repro and we're unable to reproduce. Feel free to reopen in case you can reproduce the issue. Thanks @imdavidmin for looking into it.