Feature - server secure tokens with Auth headers and jwt

norton120 commented 5 months ago

Is your feature request related to a problem? Please describe.

Currently the server auth endpoint has a TODO for the expected auth flow. I'd like to get that wrapped up with a JWT auth flow. Also, I noticed that the current API keys appear to be stored and looked up as plaintext, which is dangerous and really shouldn't be used in production.

Describe the solution you'd like

Standard FastAPI JWT support with API key header auth
A securely hashed API key, specifically an encoded two-part key (one part for the lookup, one for the auth)
A different prefix for the new keys, so we can continue to support the legacy keys for a period of time
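A minimal sketch of the two-part key idea, using only the Python standard library. The `sks-` prefix is borrowed from the suggestion later in this thread; the `.` separator, the key lengths, the function names, and the use of a plain SHA-256 digest (reasonable only because the secret part is long and random) are all assumptions for illustration, not the project's actual scheme.

```python
import hashlib
import hmac
import secrets

NEW_PREFIX = "sks-"  # assumed prefix for new-style keys


def generate_api_key() -> tuple:
    """Return (full_key, lookup_id, secret_hash); only the last two get stored."""
    lookup_id = secrets.token_hex(8)    # non-secret part, used to find the record
    secret = secrets.token_urlsafe(32)  # secret part, never stored in plaintext
    secret_hash = hashlib.sha256(secret.encode()).hexdigest()
    return f"{NEW_PREFIX}{lookup_id}.{secret}", lookup_id, secret_hash


def verify_api_key(presented: str, store: dict) -> bool:
    """Check a presented key against a {lookup_id: secret_hash} store."""
    if not presented.startswith(NEW_PREFIX):
        return False  # legacy keys would go through a separate code path
    lookup_id, sep, secret = presented[len(NEW_PREFIX):].partition(".")
    stored_hash = store.get(lookup_id)
    if not sep or stored_hash is None:
        return False
    candidate = hashlib.sha256(secret.encode()).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, stored_hash)
```

The user sees the full key once at creation time; the server keeps only the lookup id and the hash, so a leaked table does not directly reveal usable keys.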

Describe alternatives you've considered

JWT makes the most sense in the context of the chat use case - lots of fast calls and polling, rather than doing the full description and exposing the API key on every trip. Alternatively this could be 2 PRs, but given the speed of the project's evolution and how both are really needed to make auth polished, I think one makes sense.

Additional context

@sarahwooders I want to go ahead and PR this, but I couldn't figure out which issue it belongs to, so I opened a new one. Am I cool to dig into this next?

cpacker commented 4 months ago

@norton120 to summarize, what would the general MemGPT API request flow look like for users?

The initial "TODO" scaffolded auth was meant to allow running a service that has auth similar to OpenAI, just without any real auth.

So basically, instead of being the OpenAI server admin, you're the MemGPT server admin.

On the server admin side:

you spin up a server instance
that server instance has some admin account which you can use to view/edit users

On the user side:

you can create an account (missing on MemGPT server, currently must be done by admin)
you can go to an API keys website (missing on the MemGPT dev portal atm, but API routes exist) and generate an API key
now when you make your own requests to the server, you use that API key as a bearer token

^what part of this would change (if any) in the proposed PR?

Also, we're quite open to changing the general API key / auth / user-admin flow, we just thought it made sense to mimic OAI since ideally you can run MemGPT as an OAI replacement.

Really awesome contributions btw, thank you!

Also happy to chat at higher cadence on discord, feel free to DM us there too (github issues also works totally fine, up to you).

norton120 commented 4 months ago

@cpacker thanks!

My thought was to have 2 valid auth flows - one with the API key as the Auth header (so pretty much a drop-in for OpenAI the way you described) and one flow that uses a JWT handshake. The API key flow would stay the same as described above, except new keys (maybe prefixed with sks- for "secret key secure?") would be slightly longer than the current keys. Otherwise business as usual.
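A rough sketch of how the server could tell the two key generations apart from the `Authorization` header alone. The `sk-` legacy prefix is inferred from the `sk-xxx` example in this thread, `sks-` is the tentative prefix proposed above, and `classify_bearer` is a hypothetical helper name, not an existing MemGPT function.

```python
from typing import Optional, Tuple

LEGACY_PREFIX = "sk-"   # assumed prefix of the current plaintext keys
SECURE_PREFIX = "sks-"  # tentative prefix for the new hashed keys


def classify_bearer(authorization: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return ("secure" | "legacy", key) for a 'Bearer <key>' header, else None."""
    if not authorization:
        return None
    scheme, _, key = authorization.partition(" ")
    key = key.strip()
    if scheme.lower() != "bearer" or not key:
        return None
    if key.startswith(SECURE_PREFIX):
        return ("secure", key)  # verify against the stored hash
    if key.startswith(LEGACY_PREFIX):
        return ("legacy", key)  # plaintext lookup, kept for a deprecation period
    return None
```

Because the two prefixes never overlap, existing clients keep working unchanged while new keys are routed to the hashed lookup.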

The JWT flow would get new routes, and the user experience would look something like this:

API key is set as the Authorization: Bearer sk-xxx header on the initial handshake
Server responds with a refresh JWT that has a long life (say, 1 week?)
User stores the refresh JWT and uses it to get short-lived working JWT
Working JWT is sent in a header to authenticate each request
When working JWT expires, refresh JWT requests a new one

Note: there are libraries that abstract all of this away. PyJWT does most of the work and has examples of integrating it. I'd also want to add working examples to the MemGPT docs so those who want to use it have a clear idea of how to do so.
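The handshake steps above can be sketched roughly as follows. To keep the example dependency-free it hand-rolls HS256 signing with the standard library; in practice PyJWT's `jwt.encode` and `jwt.decode` would replace `issue` and `verify`. The `kind` claim, the 5-minute working lifetime, the signing secret, and all function names are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"server-side-signing-secret"  # placeholder; would come from server config
REFRESH_TTL = 7 * 24 * 3600             # "say, 1 week"
WORKING_TTL = 5 * 60                    # assumed: a few minutes


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()


def issue(sub: str, kind: str, ttl: int, now=None) -> str:
    """Create an HS256-signed token carrying a subject, a token kind, and an expiry."""
    now = time.time() if now is None else now
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": sub, "kind": kind, "exp": int(now + ttl)}).encode())
    return f"{header}.{payload}.{_b64(_sign(f'{header}.{payload}'))}"


def verify(token: str, kind: str, now=None) -> dict:
    """Return the claims if signature, kind, and expiry check out; else raise ValueError."""
    now = time.time() if now is None else now
    header, payload, sig = token.split(".")
    if not hmac.compare_digest(_sign(f"{header}.{payload}"), _unb64(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(payload))
    if claims.get("kind") != kind:
        raise ValueError("wrong token kind")
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims


def handshake(user_id: str, now=None) -> str:
    """After the API key has been verified, hand back a long-lived refresh token."""
    return issue(user_id, "refresh", REFRESH_TTL, now)


def refresh(refresh_token: str, now=None) -> str:
    """Trade a valid refresh token for a short-lived working token."""
    claims = verify(refresh_token, "refresh", now)
    return issue(claims["sub"], "working", WORKING_TTL, now)
```

A client would call the handshake once with its API key, store the refresh token, and call the refresh route whenever a request is rejected because the working token has expired.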

Why the JWT?

security: instead of exposing the "true" API key with every request, users expose it once in exchange for "disposable" keys (the JWTs). Working JWTs expire quickly (usually every few minutes) so even if your requests are compromised that exposure is limited.
speed: I think an async pub-sub flow would be ideal for plugging MemGPT server into other applications - i.e. the application publishes messages from users to their respective agents in MemGPT server, and the application subscribes to responses from those agents by polling MemGPT server (this is actually how I have been considering integrating MemGPT into our physical inventory application). In a pub-sub use case, caching the working JWT will reduce overhead from polling so the server doesn't get bogged down with auth.
foundational: Long-term I'm sure more auth schemes will be wanted/needed (Auth0, SSO, etc.), so establishing an abstraction pattern for auth mechanics is a good future-building exercise.

If this makes sense, I'll start to sketch out some example code so it's easier to discuss. I'll also catch up with you in Discord this week if you're around; I have been in the process of moving for the last week, so I've been away from the keyboard more than usual.

cpacker commented 4 months ago

@norton120 that makes sense to me, thanks for replying in such detail! Re: the pub-sub flow part (somewhat orthogonal to the whole JWT / bearer token thing) - how does this relate to POST SSE vs websockets vs ...? Would the polling be GET SSE requests (if streaming is enabled), and just regular GET otherwise? I guess in the context of a pub-sub polling setup maybe GET SSE doesn't even make that much sense to support (vs GET returning e.g. {'status': 'in_progress'}).
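For reference, a toy sketch of what the plain (non-SSE) polling response could look like. The in-memory store, the function names, and every status value other than `in_progress` are hypothetical; nothing here corresponds to an existing MemGPT route.

```python
from typing import Dict, Optional

# message_id -> reply text, or None while the agent is still working (hypothetical store)
_replies: Dict[str, Optional[str]] = {}


def submit(message_id: str) -> None:
    """Record that a message was published and is awaiting a reply."""
    _replies[message_id] = None


def complete(message_id: str, reply: str) -> None:
    """Record the agent's reply for a previously submitted message."""
    _replies[message_id] = reply


def poll(message_id: str) -> dict:
    """Body a regular GET on a status route might return."""
    if message_id not in _replies:
        return {"status": "not_found"}
    reply = _replies[message_id]
    if reply is None:
        return {"status": "in_progress"}
    return {"status": "completed", "reply": reply}
```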

norton120 commented 4 months ago

The more I think about it, the more I think websockets or SSE support is probably all that's needed for now; my original line of thought was that an application publishing/consuming messages to the MemGPT server with a JWT-scoped topic would make pre/post-processing on messages easier, but I don't think it will make a meaningful difference to the downstream implementation at this point. There's a little more overhead for developers not used to working with websockets, but if we've got decent example docs that shouldn't be a barrier to entry.

So maybe this PR would be to square away the auth (secure token and JWT), support those auth methods for ws/SSE, and document usage?

@cpacker thank you for being a sounding board here - I could be convinced either way. I liked the super low barrier to entry of standard REST polling, but this feels more and more like YAGNI.

cpacker / MemGPT

Feature - server secure tokens with Auth headers and jwt #1260