Open jgaskins opened 6 months ago
Well, the first thing to consider is: how to deal with concurrency? Reading the session on the start of the request and writing it before sending the response will work as long as the handler is pure Crystal, but quickly gets messy if Crystal calls another service and multiple fibers gets the chance to run.
Is session locking something to consider, or is it just up to the user to deal with? Storing values individually would avoid the problem somewhat, but not totally.
Real atomic sessions probably requires rethinking session from "here's a bag of values" to a service of some kind.
But that's overkill for most real-world usage. Real-time update of session derived variables is only going to complicate most pages without any real gain. In practise, "this is the session data as it looked at the start of the request" is quite sufficient in all but the oddest of cases.
Which means the real problem is writing. Some data is fine just to overwrite, while other is not. A flash message bag comes to mind, if the user loads two pages at the same time, you don't want to have the message from one overwrite the other, even if you don't care which window actually displays them both.
I checked up on PHP to see if they did anything smart, but apparently the session handling code just locks the session for the duration of the request, per default. User code can then close the session early if they have no need for writing. And the redis extension has a long issue because it didn't support locking in the old days and people ran into all sorts of problems.
So I guess that's probably the best bet.
Regarding your wish for typed session objects, I had a thought (and it's probably not thought through):
Imagine Session::Handler
and Session::Store
as generics that takes a session class (anything json-able). And a Session::register
macro that defines HTTP::Server::Context#session
to be of a given session class and Armature::Request#session
as a way to get at it from a route handler?
Session::Store
(well, it's concrete subclasses) just loads and saves the session class, locks/unlocks it, and have a #gc
for optional housecleaning. Session::Handler
handles the cookie stuff, passes data from/to the Context
and the store.
The session class could have an abstract superclass that adds #close
, #save
, #delete
and what else user code might need (#regenerate
?) which is forwarded to the handler.
Cookie-based sessions, where the session payload is serialized entirely inside the cookie, have the same issue with concurrent session mutations. That sort of session storage is the default in Rails and it looks like it's what is still in use on GitHub.
Storing values individually would avoid the problem somewhat, but not totally.
Depending on the storage backend, we could store them together but query/update them separately.
I've experimented with storing them separately — using a NATS KV, the user_id
and csrf
token were stored in two separate keys. This meant I didn't need to batch query/update like the Redis store currently does, but then the expiration timestamps are different for keys in the same session. This can be useful if you're storing a short-lived value on that key (redirect_uri
after a successful login, one-time password, flash message, etc) but also surprising in some awkward ways.
In Redis, at least, we have several options and I've been considering providing different session backends to allow someone to choose which scenario they want to optimize for:
String => String
mapping in session data
session["a/b-test.experiments.foo"]?
for A/B testing)user_id
property in a 1M-query pipeline took ~3x as long with RedisJSON as it did when stored in a hashAnother thing to consider is that differences in capability between implementations may require differences in APIs between them. For most cases, it's probably fine because the Crystal type checker can automatically downcast variables from abstract types to concrete types if there's only one concrete subclass and there isn't really benefit to loading multiple session storage implementations. I saw this in my NATS KV-based session store experiment, which only mapped strings to strings, where session["user_id"]?
returned a String?
at compile time — I didn't have to unwrap a JSON::Any
. However, for testing, it may be a good idea to have an in-memory implementation for performance and to avoid throwing a bunch of test data into a persistent data store, so the APIs there do need to match.
Sessions are currently difficult to extend to allow for different storage backends. I'm not sure yet what needs to be done to make that easier (or I would've done it), but it should definitely be easier. A couple things that I am sure I don't like:
Armature::Session::Store::Session
as a constant path.Session
being named twice just feels weird in a way I can't articulate.Session
should not have to set specific instance variables and should ideally lean on methods for theSession::Store
to callabstract def
s for session implementations to define is almost certainly wrong#[]?
to it because you may need to account for session keys that haven't been set yet#[]
at all, tbh — I haven't used it in yearssession[key] = value
call would immediately store that and we wouldn't have to worry about things like dirty tracking or any cleanup at all, reallyHSET
but a storage adapter like NATS KV would store each value into a different NATS KV key because it just stores strings and not more complex data structures.MessagePack
for caching, so maybe that would've been a better idea for session storage since it's faster and uses less memoryThe only concrete things I'm really concerned about are:
Store
is important to keep because it's used as theHTTP::Server
middlewareStore
should be the only thing directly interacting with the backing store (Redis, NATS, Postgres, the filesystem, an in-memory hash, whatever it happens to be)Session
should be simplecsrf
value), we don't store anything about the session