Open Kludex opened 1 year ago
As explained in the Chromium bug report, regarding the code snippet:
The attached file is a minimal Starlette (Python) application where the root is an HTML page that initiates a slow (5 seconds) fetch request to
/set_cookie_on_fetch
but navigates to/set_cookie_and_redirect
after 2 seconds, before the fetch operation has had time to complete. Both operation have aSet-Cookie
header with a timestamp and a specific prefix./set_cookie_and_redirect
redirects to a page,/final
, that prints the current value for the cookie in question. Immediately after the redirect, the value begins with "b__", as expected, but waiting for the server to return the result of/set_cookie_on_fetch
and then refreshing again shows that the cookie's value has now changed and begins with "a__"
This issue does not occur with Firefox or Microsoft Edge, which are the two other browsers I have at hand to test this on.
We hit exactly the same issue, and via https://github.com/lepture/authlib/issues/334 found that this was already reported here.
We are developing an OAuth feature for Solara and if we navigate away from the page (to our login endpoint), we also send out a window.navigator.sendBeacon(close_url);
to make sure we clean up the application in case the websocket close does not get detected soon. These two requests happen at approximately the same time, and with the dev console open, the sendBeacon seems to win, effectively resetting the cookie that our login redirect just set.
We made a modification to the session middleware, that assigns to the session object a dict like object that keeps track if any key is modified. We now only send the cookie if the session dict was modified, and that solved our issue. Maybe this implementation makes sense for Starlette as well.
My guess is flask does this as well: https://flask.palletsprojects.com/en/2.2.x/api/#flask.session
Hey, any update? Even doing this isn't a suitable approach since now the response will contain 2 set-cookie
headers:
@app.middleware("http")
async def some_middleware(request: Request, call_next):
response = await call_next(request)
session = request.cookies.get("session")
if session:
response.set_cookie(
key="session", value=request.cookies.get("session"), httponly=True
)
return response
$ curl -IX GET --cookie "session=<XXX>" localhost:8080
HTTP/1.1 307 Temporary Redirect
date: Fri, 02 Jun 2023 08:25:23 GMT
server: uvicorn
content-length: 0
location: /login
set-cookie: session=<YYY>; path=/; Max-Age=1209600; httponly; samesite=lax
set-cookie: session=<XXX>; HttpOnly; Path=/; SameSite=lax
Edit: I ended up switching to starlette-authlib which apparently works correctly, at least in this scenario.
Is there any update on it?
No... 👀
The issue is because the session middleware is always handling every request as needing session data to persist, even if it was already present.
If the scope has data the cookie is set. Then the elif regardless if initial_session_was_empty set to false when the cookie is validated this will erase the cookie.
However fixing this I feel will cause a larger problem and more changes which is why I haven't just submitted a pull request for the change. Adding a couple more cases to the if statement in the send_wraper will temporarily keeps the same session cookie for the lifetime of the session, however that will cause the session to become invalidated after the max_age is reached and the users session will be destroyed.
I feel a complete fix would also need an option for when we want the sessions to refresh beyond the max_age like a refresh window. Like if the current time is with in the max_age minus refresh window, set initial_session_was_empty true (and rename that variable to something more descriptive like needs_persistence or some such) and write a new cookie as long as the current cookie is still valid.
Testing the starlete_authlib.middleware it looks like it is persisting the cookie but taking a quick look at the library I suspect it will have the expiration issue. It's storing the expiration in the session but doesn't appear to be doing anything with it after that.
IMO the main problem with the sesion-cookie storage is that it doesn't have a stable encoding, because itsdangerous (being a kind of proto-JWT with no header) includes the timestamp where the data was signed in its payload, so you can't even go the naive route of comparing the (string) cookie value in the request and the cookie value that the middleware is about to inject in the response to make sure it's not being needlessly re-sent. I mentionned this on the FastAPI side (https://github.com/tiangolo/fastapi/issues/754#issuecomment-1464153820), but I personally believe it would be better to switch to JWT for this and use fixed fields like nbf
and exp
instead of a timestamp+max_age combo, so that the encoded value remains constant if nothing changes.
Obviously the preferable way of going about this would be to skip serializing and signing session data altogether if nothing has changed, but the experiments I've tried make it seem tricky to efficiently and reliably do at the middleware and/or Starlette level.
Using the session middleware with a JWS token provided by an external service was my intention with what I was considering it for. The method which the session is taking to verify if it's tampered with wasn't going to affect me other than this extra layer being provided by the sessions middleware is inefficient.
The persistent work flow in the pull request I logged doesn't use the comparison at all to persist the data. That comparison is to check if the token needs to be updated if the data has changed. The persistence either persists until the token is no longer valid, or if there's a refresh window defined and it needs to re-write the token with a new signature. To be fair all of what I committed for my purposes could be moved out of the session middleware and into it's own separate cookie and handled as a function in each call. The middleware makes that a bit more convenient. All the existing middleware is doing is handling the cookie data and making sure it's not tampered with and doing it with basically the same lifecycle as a JWS token, a signed string.
I'm separating the concept of authentication from session. The session middleware in my opinion appears to be intended to use be used for maintaining session state, not necessarily authentication. However as you alluded to here and in the FastAPI comment, the session middleware cookie is effectively no different than a JWT token. JWS tokens borrowed from the concept that the sessions middleware also borrowed from and expanded on it. But comparing these two in my opinion would be confusing two concepts, session state and authentication since we can have session states with out being authenticated. JWT/JWS tokens are intended to provide an assertion by a 3rd party and verify its authenticity, while sessions are not provided by a 3rd party.
Regarding the cryptography aspects, I had a much longer response written but decided to change direction and simplify it how I believe I understand your concern. A JWT (JWS) is the same concept it's a signed string containing data (sessions cookies as they stand now). The only conceptual difference in respects to a session is the JWT token includes a header. Beyond the header the difference between a JWS token and a itsdangerous timestampsigner session cookie can be the keys, but can also be signed with a secret key and identical.
JWS tokens can either be signed with a private secret (like sessions.py now) or with public/private keys, but the RFC suggests they should be asymmetric. The sessions cookie is signed using HMAC SHA-1 with a secret key, omitting the rotating keys and digest_method options in itsdangerous. The itsdangerous response generated in the sessions library at least is including the timestamp adding some randomness to the string before signing so it's not quite so terrible. According to the standards JWS can be signed any way we like, including using a private secret with HMAC SHA-1. The strength of the secret key in either case is as important as the strength of a password or the private key used, key rotation would improve things but then a key store would be needed. Leaking the secret(s) or a private key is the same risk. Signing and verifying on the same service means that it has access to the private secret, or both public and private keys. If the cookie data was signed using the same method as a JWS token, then the two wouldn't be any different.
So long story short, using a secret key is another concern and should be a separate issue. The difficulty in changing the session data to a stronger signing method is less about implementing it in the library, but more with usage, documentation, and tests. So I feel like this separate problem statement should be: The current implementation is using a Secret key, and no method to change the digest_method from the default HMAC SHA-1. A JWT token wouldn't really be a solution to session security since it has the same problem, and risks adding overhead and complexity that isn't necessarily needed by a first party solution.
Note that our implementation is at https://github.com/widgetti/solara/blob/master/packages/solara-enterprise/solara_enterprise/auth/middleware.py
This is part of a non-open source licensed part of our mono repo, but consider it open source (the implementation is trival anyway). If someone has the time to turn this into a PR, that would be awesome!
Too add, regarding the concern of getting the timestamp, it's in the cookie value string which is base64 encoded. The string has 3 parts, the Payload, Timestamp and Signature separated by a dot '.' which is the default value for itsdangerous. The timestamp if the string is split on the dots would be element 1 (second) so if we just wanted resolve the timestamp it just needs to be decoded, no different than a JWT/JWS token where you need to decode the payload and get for the 'iat' key. So instead of converting the json, we just split the line on dots and grab element 1. If you were entirely uninterested in the validity of the token, decode element 0 to a string to obtain the json string.
def sign(self, value: _t_str_bytes) -> bytes:
"""Signs the given string and also attaches time information."""
value = want_bytes(value)
timestamp = base64_encode(int_to_bytes(self.get_timestamp()))
sep = want_bytes(self.sep)
value = value + sep + timestamp
return value + sep + self.get_signature(value)
I also don't think relying on the timestamp for anything other than verifying the validity of the message is a good idea.
The pull request I created and linked to this discussion has working persistence and refresh. I just haven't created the test cases for the additional features and wanted more discussion about the subject.
@maartenbreddels I intend to take a look at your sample, it looks lot cleaner than what I submitted. So maybe I will make it into a PR :)
Discussed in https://github.com/encode/starlette/discussions/2018