jgaskins / armature

Roda-inspired HTTP framework for Crystal, providing routing, sessions, forms, etc
MIT License
25 stars 1 forks source link

Iron out Session extensibility #1

Open jgaskins opened 6 months ago

jgaskins commented 6 months ago

Sessions are currently difficult to extend to allow for different storage backends. I'm not sure yet what needs to be done to make that easier (or I would've done it), but it should definitely be easier. A couple things that I am sure I don't like:

The only concrete things I'm really concerned about are:

xendk commented 4 months ago

Well, the first thing to consider is: how to deal with concurrency? Reading the session on the start of the request and writing it before sending the response will work as long as the handler is pure Crystal, but quickly gets messy if Crystal calls another service and multiple fibers gets the chance to run.

Is session locking something to consider, or is it just up to the user to deal with? Storing values individually would avoid the problem somewhat, but not totally.

Real atomic sessions probably requires rethinking session from "here's a bag of values" to a service of some kind.

But that's overkill for most real-world usage. Real-time update of session derived variables is only going to complicate most pages without any real gain. In practise, "this is the session data as it looked at the start of the request" is quite sufficient in all but the oddest of cases.

Which means the real problem is writing. Some data is fine just to overwrite, while other is not. A flash message bag comes to mind, if the user loads two pages at the same time, you don't want to have the message from one overwrite the other, even if you don't care which window actually displays them both.

I checked up on PHP to see if they did anything smart, but apparently the session handling code just locks the session for the duration of the request, per default. User code can then close the session early if they have no need for writing. And the redis extension has a long issue because it didn't support locking in the old days and people ran into all sorts of problems.

So I guess that's probably the best bet.

Regarding your wish for typed session objects, I had a thought (and it's probably not thought through):

Imagine Session::Handler and Session::Store as generics that takes a session class (anything json-able). And a Session::register macro that defines HTTP::Server::Context#session to be of a given session class and Armature::Request#session as a way to get at it from a route handler?

Session::Store (well, it's concrete subclasses) just loads and saves the session class, locks/unlocks it, and have a #gc for optional housecleaning. Session::Handler handles the cookie stuff, passes data from/to the Context and the store.

The session class could have an abstract superclass that adds #close, #save, #delete and what else user code might need (#regenerate?) which is forwarded to the handler.

jgaskins commented 4 months ago

Cookie-based sessions, where the session payload is serialized entirely inside the cookie, have the same issue with concurrent session mutations. That sort of session storage is the default in Rails and it looks like it's what is still in use on GitHub.

Storing values individually would avoid the problem somewhat, but not totally.

Depending on the storage backend, we could store them together but query/update them separately.

I've experimented with storing them separately — using a NATS KV, the user_id and csrf token were stored in two separate keys. This meant I didn't need to batch query/update like the Redis store currently does, but then the expiration timestamps are different for keys in the same session. This can be useful if you're storing a short-lived value on that key (redirect_uri after a successful login, one-time password, flash message, etc) but also surprising in some awkward ways.

In Redis, at least, we have several options and I've been considering providing different session backends to allow someone to choose which scenario they want to optimize for:

the current "fetch everything, mutate, then store everything" approach

Another thing to consider is that differences in capability between implementations may require differences in APIs between them. For most cases, it's probably fine because the Crystal type checker can automatically downcast variables from abstract types to concrete types if there's only one concrete subclass and there isn't really benefit to loading multiple session storage implementations. I saw this in my NATS KV-based session store experiment, which only mapped strings to strings, where session["user_id"]? returned a String? at compile time — I didn't have to unwrap a JSON::Any. However, for testing, it may be a good idea to have an in-memory implementation for performance and to avoid throwing a bunch of test data into a persistent data store, so the APIs there do need to match.