adsabs / adsws

ADS web services
Other
2 stars 15 forks source link

Evaluate and improve session cookies #168

Open marblestation opened 5 years ago

marblestation commented 5 years ago

Each API request generates a new session cookie, which:

In flask we have a config variable SESSION_REFRESH_EACH_REQUEST we could set to False (default is True) to avoid constantly setting a new session for each request:

SESSION_REFRESH_EACH_REQUEST config key controls the set-cookie behavior. If set to True a permanent session will be refreshed each request and get their lifetime extended, if set to False it will only be modified if the session actually modifies.

In summary:

I didn't find any place in adsws where we change the content of the session, but unfortunately we are using extensions that modify the session (e.g., https://github.com/maxcountryman/flask-login/blob/528e1efc69162b433825c4c02f62a1226eadc570/flask_login.py#L406) even if the same value is already stored in the same key, but this is enough for flask to interpret that the session has been modified and a new cookie should be issued despite of having SESSION_REFRESH_EACH_REQUEST set to False.

In any case, we would not want to set SESSION_REFRESH_EACH_REQUEST to False and have all the sessions expired after PERMANENT_SESSION_LIFETIME (which is set to 1 year right now), not getting extended even if the user has been using it through that year:

Sending the cookie every time (the default) can more reliably keep the session from expiring, but uses more bandwidth.

(http://flask.pocoo.org/docs/1.0/config/#SESSION_REFRESH_EACH_REQUEST)

Another option discussed in the team is to change from client-side sessions to server-side sessions using Flask Session. An implementation can be found in this branch of my fork. But setting SESSION_REFRESH_EACH_REQUEST to False and using Flask-Session to store sessions on the server side (redis), still generates API responses with: Set-Cookie: session=d651d4fa-8097-4c3c-993d-d66439ba637c.xAaBX8w18P_MulSGUxdIsmejctA; Expires=Wed, 24-Jun-2020 21:10:58 GMT; HttpOnly; Path=/

Given that some extensions modify the session as mentioned above, flask considers that the session needs to be saved which triggers the set-cookie (in turn, it also updates the expiration date sent to the browser).

Thus, current scenario:

Scenario with existing modifications:

In terms of security, the data stored client-side is encrypted and only adsws can read or alter it, so there is no risk of users manipulating data to become another user and going server-side does not suppose a particular advantage in this front.

Regarding the "small overhead to each request" point, some rough estimates:

Fixing the cookies being constantly exchanged because there are extensions that "modify" its content is something that could be done without introducing server side sessions. If we modify the code to only send cookies every N days (their expiration is 1 year, we have a large margin) or when they really change, then we have fairly stable sessions that would remove the 100ms tax in link gateway, the 10ms tax for every API request and the ~200 Bytes exchange in every request

A non-exhaustive list of options:

1) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and keep redis caching in resolver gateway for the few cases where it will be useful (e.g., users that click multiple times the same fulltext link, users that open multiple tabs, users that use BBB in slow connection mode) 2) Keep the default behavior which ensures sessions do not expire even if it uses more bandwidth, and disable redis caching in resolver gateway 3) Modify the current adsws and its extensions to avoid unnecessarily updating the session, and only update the expiration time every N days (we have a large margin given that the cookie expires in 1 year), keeping client-side sessions 4) Use the existing modifications to have server-side sessions and modify the current adsws and its extensions to avoid unnecessarily updating the session as mentioned in the previous point

I may have overlooked pros/cons, but so far I cannot think of a reason to move from client-side to server-side sessions. On the other hand, I see how we would benefit from making sessions more stable while still being client-side. With the current picture, option 3 seems a good option in my opinion. In any case, given the small estimated improvements and having other bigger issues to tackle, this task or discussions around it are considered lower priority at this moment.