Each API request generates a new session cookie, which:
Makes requests a bit bigger (~200 bytes)
Makes resolver gateway not take full advantage of its caching capabilities
In flask we have a config variable SESSION_REFRESH_EACH_REQUEST we could set to False (default is True) to avoid constantly setting a new session for each request:
SESSION_REFRESH_EACH_REQUEST config key controls the set-cookie behavior. If set to True a permanent session will be refreshed each request and get their lifetime extended, if set to False it will only be modified if the session actually modifies.
In summary:
if SESSION_REFRESH_EACH_REQUEST is True (default), it will always set a cookie with a new expire time
I didn't find any place in adsws where we change the content of the session, but unfortunately we are using extensions that modify the session (e.g., https://github.com/maxcountryman/flask-login/blob/528e1efc69162b433825c4c02f62a1226eadc570/flask_login.py#L406) even if the same value is already stored in the same key, but this is enough for flask to interpret that the session has been modified and a new cookie should be issued despite of having SESSION_REFRESH_EACH_REQUEST set to False.
In any case, we would not want to set SESSION_REFRESH_EACH_REQUEST to False and have all the sessions expired after PERMANENT_SESSION_LIFETIME (which is set to 1 year right now), not getting extended even if the user has been using it through that year:
Sending the cookie every time (the default) can more reliably keep the session from expiring, but uses more bandwidth.
Another option discussed in the team is to change from client-side sessions to server-side sessions using Flask Session. An implementation can be found in this branch of my fork. But setting SESSION_REFRESH_EACH_REQUEST to False and using Flask-Session to store sessions on the server side (redis), still generates API responses with:
Set-Cookie: session=d651d4fa-8097-4c3c-993d-d66439ba637c.xAaBX8w18P_MulSGUxdIsmejctA; Expires=Wed, 24-Jun-2020 21:10:58 GMT; HttpOnly; Path=/
Given that some extensions modify the session as mentioned above, flask considers that the session needs to be saved which triggers the set-cookie (in turn, it also updates the expiration date sent to the browser).
Thus, current scenario:
new session cookie after each request
link gateway cannot cache as much as we would like because of the constantly changing cookies
no server side storage, so no connections to database/redis (all the info is encrypted in the cookie)
[same] same session cookie, but still responding requests with set-cookie with and updated expiration date (ends up being like the current scenario)
[better] link gateway could cache plenty of requests since session cookie does not change
[bad but a one time thing] current user's cookies will be invalidated if we deploy this to production (we break the history of anonymous users)
[worse] server side storage, connections to redis for every request (we add a small overhead to each request)
[worse] if redis is restarted or if keys are deleted manually (e.g., to free memory), we will lose history of anonymous users
In terms of security, the data stored client-side is encrypted and only adsws can read or alter it, so there is no risk of users manipulating data to become another user and going server-side does not suppose a particular advantage in this front.
Regarding the "small overhead to each request" point, some rough estimates:
Currently we are spending 100ms for link gateway requests to check sessions with adsws (internal HTTP request)
With the existing modifications, we would pay 10ms extra for all API requests (BBB and non-BBB, given that we always return back a set-cookie) because the session needs to be read/write from/to redis
cookies are constantly exchanged in both scenarios right now (~200 Bytes), in the future we would fix this
Fixing the cookies being constantly exchanged because there are extensions that "modify" its content is something that could be done without introducing server side sessions. If we modify the code to only send cookies every N days (their expiration is 1 year, we have a large margin) or when they really change, then we have fairly stable sessions that would remove the 100ms tax in link gateway, the 10ms tax for every API request and the ~200 Bytes exchange in every request
A non-exhaustive list of options:
1) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and keep redis caching in resolver gateway for the few cases where it will be useful (e.g., users that click multiple times the same fulltext link, users that open multiple tabs, users that use BBB in slow connection mode)
2) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and disable redis caching in resolver gateway
3) Modify the current adsws and its extensions to avoid unnecessarily updating the session, and only update the expiration time every N days (we have a large margin given that the cookie expires in 1 year), keeping client-side sessions
4) Use the existing modifications to have server-side sessions and modify the current adsws and its extensions to avoid unnecessarily updating the session as mentioned in the previous point
I may have overlooked pros/cons, but so far I cannot think of a reason to move from client-side to server-side sessions. On the other hand, I see how we would benefit from making sessions more stable while still being client-side. With the current picture, option 3 seems a good option in my opinion. In any case, given the small estimated improvements and having other bigger issues to tackle, this task or discussions around it are considered lower priority at this moment.
Each API request generates a new session cookie, which:
In flask we have a config variable
SESSION_REFRESH_EACH_REQUEST
we could set toFalse
(default isTrue
) to avoid constantly setting a new session for each request:In summary:
SESSION_REFRESH_EACH_REQUEST
is True (default), it will always set a cookie with a new expire timeSESSION_REFRESH_EACH_REQUEST
is False, it won't set a cookie unless something in the session has changed (i.e., the app has updated something in the session): https://github.com/pallets/flask/blob/1351d0a56580df36872b466eb245e7634c20dab5/src/flask/sessions.py#L283I didn't find any place in
adsws
where we change the content of the session, but unfortunately we are using extensions that modify the session (e.g., https://github.com/maxcountryman/flask-login/blob/528e1efc69162b433825c4c02f62a1226eadc570/flask_login.py#L406) even if the same value is already stored in the same key, but this is enough for flask to interpret that the session has been modified and a new cookie should be issued despite of havingSESSION_REFRESH_EACH_REQUEST
set to False.In any case, we would not want to set
SESSION_REFRESH_EACH_REQUEST
toFalse
and have all the sessions expired afterPERMANENT_SESSION_LIFETIME
(which is set to 1 year right now), not getting extended even if the user has been using it through that year:(http://flask.pocoo.org/docs/1.0/config/#SESSION_REFRESH_EACH_REQUEST)
Another option discussed in the team is to change from client-side sessions to server-side sessions using Flask Session. An implementation can be found in this branch of my fork. But setting
SESSION_REFRESH_EACH_REQUEST
to False and usingFlask-Session
to store sessions on the server side (redis), still generates API responses with:Set-Cookie: session=d651d4fa-8097-4c3c-993d-d66439ba637c.xAaBX8w18P_MulSGUxdIsmejctA; Expires=Wed, 24-Jun-2020 21:10:58 GMT; HttpOnly; Path=/
Given that some extensions modify the session as mentioned above, flask considers that the session needs to be saved which triggers the set-cookie (in turn, it also updates the expiration date sent to the browser).
Thus, current scenario:
Scenario with existing modifications:
In terms of security, the data stored client-side is encrypted and only adsws can read or alter it, so there is no risk of users manipulating data to become another user and going server-side does not suppose a particular advantage in this front.
Regarding the "small overhead to each request" point, some rough estimates:
Fixing the cookies being constantly exchanged because there are extensions that "modify" its content is something that could be done without introducing server side sessions. If we modify the code to only send cookies every N days (their expiration is 1 year, we have a large margin) or when they really change, then we have fairly stable sessions that would remove the 100ms tax in link gateway, the 10ms tax for every API request and the ~200 Bytes exchange in every request
A non-exhaustive list of options:
1) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and keep redis caching in resolver gateway for the few cases where it will be useful (e.g., users that click multiple times the same fulltext link, users that open multiple tabs, users that use BBB in slow connection mode) 2) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and disable redis caching in resolver gateway 3) Modify the current adsws and its extensions to avoid unnecessarily updating the session, and only update the expiration time every N days (we have a large margin given that the cookie expires in 1 year), keeping client-side sessions 4) Use the existing modifications to have server-side sessions and modify the current adsws and its extensions to avoid unnecessarily updating the session as mentioned in the previous point
I may have overlooked pros/cons, but so far I cannot think of a reason to move from client-side to server-side sessions. On the other hand, I see how we would benefit from making sessions more stable while still being client-side. With the current picture, option 3 seems a good option in my opinion. In any case, given the small estimated improvements and having other bigger issues to tackle, this task or discussions around it are considered lower priority at this moment.