Open markhobson opened 1 month ago
Hi @markhobson thanks for the issue. In the latest version there should not be anything set to storage unless you are modifying the session or have SESSION_REFRESH_EACH_REQUEST set to true. Can you confirm you aren't setting the session at all like was occurring in #232 ?
Are you using a proxy or hosting service to cache your static files? That would normally preclude this issue.
However, regarding not opening the session at all, this is an interesting proposal if there is a use case. I wonder if there is a better way though than providing urls, which would add complexity for a developer, rather than simplifying. Something that would only trigger on read or write to session object.
One idea is re-working how flask works with sessions. Rather than opening a session with each request, the session is filled with an empty session object and the object itself calls the open_session method by overwriting the get/set method and checking if previously accessed or modified. This could be worth experimenting, but it's a pretty fundamental change.
Hi @Lxstr, thanks for the considered reply. I'm not setting the session in each request but I was unaware of SESSION_REFRESH_EACH_REQUEST
defaulting to true. That's a subtle Flask default - it'd be helpful to clarify this behaviour in the Flask-Session docs.
Turning that flag off, it does indeed no longer update the session in the storage backend. It does still open the session on every request which queries the database. The reason I'm trying to minimise hits to the database is to work around #251 - this is an issue running locally for development when using an in-memory SQLite database, hence why a proxy wouldn't be useful here.
Is what you're proposing here a lazy session? One that does nothing until it is get or set. This could be implemented as a decorating SessionInterface
and Session
, a bit like my StaticRequestFilteringSessionInterface
example above. As you say, that could avoid the need to specify details such as request URLs to filter.
You might also be interested in a few other calls to avoid opening a session that I read:
I had a go at the lazy session pattern. This seems to do the trick:
from typing import Any, Callable, Iterator
from flask import Flask, Request, Response
from flask.sessions import SessionInterface, SessionMixin
class LazySessionInterface(SessionInterface):
def __init__(self, delegate: SessionInterface):
self._delegate = delegate
def open_session(self, app: Flask, request: Request) -> SessionMixin | None:
def open_session_delegate() -> SessionMixin | None:
return self._delegate.open_session(app, request)
return LazySession(open_session_delegate)
def save_session(self, app: Flask, session: SessionMixin, response: Response) -> None:
if not isinstance(session, LazySession):
raise TypeError(f"session must be a LazySession: {session}")
if not session.is_open:
return
return self._delegate.save_session(app, session.delegate, response)
class LazySession(SessionMixin):
def __init__(self, open_session: Callable[[], SessionMixin | None]):
self._open_session = open_session
self._delegate: SessionMixin | None = None
def __setitem__(self, key: str, value: Any) -> None:
return self.delegate.__setitem__(key, value)
def __delitem__(self, key: str) -> None:
return self.delegate.__delitem__(key)
def __getitem__(self, key: str) -> Any:
return self.delegate.__getitem__(key)
def __len__(self) -> int:
return self.delegate.__len__()
def __iter__(self) -> Iterator[str]:
return self.delegate.__iter__()
@property
def delegate(self) -> SessionMixin:
if self._delegate is None:
self._delegate = self._open_session()
return self._delegate
@property
def is_open(self) -> bool:
return self._delegate is not None
You can try it out in markhobson/flask-session-issue from #251 with:
...
def create_app():
...
Session(app)
app.session_interface = LazySessionInterface(app.session_interface)
...
Before these changes, putting breakpoints on SqlAlchemySessionInterface._retrieve_session_data
and _upsert_session
and hitting the endpoint twice results in:
retrieve_session_data
upsert_session
127.0.0.1 - - [05/Jun/2024 08:58:29] "GET / HTTP/1.1" 200 -
retrieve_session_data
retrieve_session_data
retrieve_session_data
upsert_session
upsert_session
upsert_session
127.0.0.1 - - [05/Jun/2024 08:58:29] "GET /static/main3.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:29] "GET /static/main2.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:29] "GET /static/main1.css HTTP/1.1" 304 -
retrieve_session_data
upsert_session
127.0.0.1 - - [05/Jun/2024 08:58:33] "GET / HTTP/1.1" 200 -
retrieve_session_data
retrieve_session_data
retrieve_session_data
upsert_session
upsert_session
upsert_session
127.0.0.1 - - [05/Jun/2024 08:58:33] "GET /static/main3.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:33] "GET /static/main1.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:33] "GET /static/main2.css HTTP/1.1" 304 -
Using LazySessionInterface
then eliminates retrieve and upserting the session for the static resources, as they don't use the session:
retrieve_session_data
upsert_session
127.0.0.1 - - [05/Jun/2024 08:57:59] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [05/Jun/2024 08:57:59] "GET /static/main2.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:59] "GET /static/main1.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:59] "GET /static/main3.css HTTP/1.1" 304 -
retrieve_session_data
upsert_session
127.0.0.1 - - [05/Jun/2024 08:58:01] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [05/Jun/2024 08:58:01] "GET /static/main2.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:01] "GET /static/main3.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:58:01] "GET /static/main1.css HTTP/1.1" 304 -
Finally, combining that with SESSION_REFRESH_EACH_REQUEST=False
avoids the unnecessary upsert on the second request:
retrieve_session_data
upsert_session
127.0.0.1 - - [05/Jun/2024 08:57:14] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [05/Jun/2024 08:57:14] "GET /static/main2.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:14] "GET /static/main1.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:14] "GET /static/main3.css HTTP/1.1" 304 -
retrieve_session_data
127.0.0.1 - - [05/Jun/2024 08:57:41] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [05/Jun/2024 08:57:41] "GET /static/main1.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:41] "GET /static/main2.css HTTP/1.1" 304 -
127.0.0.1 - - [05/Jun/2024 08:57:41] "GET /static/main3.css HTTP/1.1" 304 -
Hi @markhobson thanks for the great effort on this. I'm really liking it. I still I have a few PRs to merge but this could be good to add for 1.0. I can see that there would be use for this.
We still would have some implementation details to hash out and ask the core team about it. How would it be best to implement, with a flag or built in directly?
There's some downside to each. On one hand adding more flags means there's a lot of settings and have to change defaults to get the ideal performance. However, directly built in might be too large a change even for a major bump. A third option might be replacing SESSION_REFRESH_EACH_REQUEST with SESSION_LAZY and result in either your first or your final example.
@ThiefMaster @davidism I'd appreciate your thoughts on this one. Is this code a valid/readable approach? Could you see any unintended side effects?
You can find my thoughts in https://github.com/pallets/flask/issues/5491. The author didn't really address any of it. You brought up the same main point: static files should be served by the HTTP server, at which point this is no longer needed.
In production, if performance is an issue, you want to serve your static files directly through your HTTP server, not through Flask. Then this doesn't apply anyway. In other cases, you may want to serve static files with other conditions applied, at which point it's not clear sessions should always be excluded. I don't think it's worth adding the complexity of implementation and explanation to Flask itself, especially when it's already possible to write a custom session (a completely supported and intended public API) to do whatever you want for your case.
Requests for static resources currently cause a session to be opened, which can be costly as the session expiry date is updated in the storage backend. Consider adding a configuration key to bypass opening sessions for static resources.
A workaround is to use a custom session interface, for example:
Configured with: