CSDL-UMD / Rockwell

Rockwell uses the Twitter authentication workflow to render a Twitter-like feed in order to collect information about users' interactions with their feeds. It also has an attention check feature to ensure that users are paying attention to their feeds and not simply scrolling through with the intent of finishing quickly.

Use session cookies #64

Open glciampaglia opened 3 years ago

glciampaglia commented 3 years ago

To-do (Feb 1, 2022):

Original issue

Currently, whenever the user reloads the page, we go through the Twitter authorizer again and a new session is created in the database. This multiplies the sessions and the complexity of the database.

Instead, we will set a session cookie using Pug (the original plan was to use Python requests).

Here is the workflow:

Last but not least, we need to specify that a cookie older than 10 minutes is expired. (This needs to be set in Pug, not in Requests.)

RobertAndion commented 3 years ago

We need to make sure that, when the Twitter credentials expire in guest access twitter, we redirect back to the authorizer. (Not sure whether this already happens automatically, as it is an edge case that hasn't been tested.)

Thoughts on implementation:

- Place a cookie in the authorizer if one does not exist. If one exists, expired or not, leave it as is.
- Inside guest access, check our cookie on each call. If it is valid, load from the database. If it is not, remake the cookie, then reload the home-timeline as stated above, check for deleted tweets, and handle them accordingly. (If our session has expired and our Twitter key has expired, we need to leave the cookie alone and redirect back to the authorizer.)

Possible bug: using since_id has one edge case that could break it. If we only get 19 tweets back with since_id (one has since been deleted and no new tweets are on the timeline), we will be unable to get the full 20.

glciampaglia commented 3 years ago

Adding needs discussion to see how to handle possible bug mentioned above.

RobertAndion commented 3 years ago

We also need to handle the case where a user blocks cookies (the default on some browsers). If we do not have a cookie, we need to assume the cookie is invalid. This is the safe choice, since it assumes we have to do the full workflow. Additionally, do we need to ask for the user's permission to use cookies? (Not sure what the regulation is.)

glciampaglia commented 3 years ago

We discussed the edge cases:

1. When calling home_timeline with the since_id parameter and we find that some IDs are missing, we will treat those as deleted ONLY if they are IDs of older tweets. This ensures that the tweets have actually been deleted, and not that the home timeline has simply updated completely (the home timeline returns the latest 800 tweets, so for people who follow a lot of accounts it could change completely very often).
2. If the browser blocks cookies, we treat it the same as if the cookie were always expired.
3. We will include wording in the informed consent statement saying that if you sign in, you agree to having a cookie stored on your computer. (See #70.)
4. Related to this, we need a `<noscript>` tag with a message saying that the browser needs to run JavaScript. (See #70.)
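The cookie handling discussed so far (valid cookie, expired cookie, blocked cookie, expired Twitter credentials) can be summarized as a small decision function. This is only a sketch under assumptions: the names `Action`, `cookie_is_valid`, and `next_action` are hypothetical and not taken from the Rockwell code, and the 10-minute lifetime is the value stated in the original issue.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional

# Assumption: the 10-minute expiry from the original issue text.
COOKIE_TTL = timedelta(minutes=10)


class Action(Enum):
    LOAD_FROM_DATABASE = "load_from_database"
    REFRESH_TIMELINE = "refresh_timeline"
    REDIRECT_TO_AUTHORIZER = "redirect_to_authorizer"


def cookie_is_valid(issued_at: Optional[datetime], now: datetime) -> bool:
    """A missing cookie (e.g. the browser blocks cookies) counts as expired."""
    if issued_at is None:
        return False
    return now - issued_at < COOKIE_TTL


def next_action(
    issued_at: Optional[datetime],
    twitter_credentials_valid: bool,
    now: Optional[datetime] = None,
) -> Action:
    """Decide what guest access should do on each call (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    if cookie_is_valid(issued_at, now):
        # Valid cookie: serve the feed already stored in the database.
        return Action.LOAD_FROM_DATABASE
    if twitter_credentials_valid:
        # Expired or missing cookie: remake it and reload the home timeline.
        return Action.REFRESH_TIMELINE
    # Cookie and Twitter credentials both expired: leave the cookie alone
    # and send the participant back through the authorizer.
    return Action.REDIRECT_TO_AUTHORIZER
```

Passing `issued_at=None` models the blocked-cookie case, so it always falls through to the full workflow.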

glciampaglia commented 3 years ago

We discussed the issue again. We figured out that even though the value of the cookie is determined in get_feed, this function is not the one that returns directly to the participant's browser. Instead, it returns a JSON object to Pug, which then uses this JSON to render the feed. Therefore, the cookie must be set in Pug, not in Python.

saumyabhadani95 commented 3 years ago

The workflow for this is almost complete; the only problem is that I am not able to get the cookies. In fact, I am not able to fetch the webpage http://127.0.0.1:3000/ (on which the Truman feed is running) using either requests or urllib: Python requests just freezes, and urllib gives the error "Remote end closed connection without response". However, I am able to fetch http://127.0.0.1:5000/ (on which the Twitter authenticator is running).
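To tell a hang apart from a dropped connection when reproducing this, one option is to probe both ports with an explicit timeout. The following is a minimal sketch using only the standard library; the helper name `probe` and the 5-second default are assumptions, not part of the project.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> tuple[str, str]:
    """Return ("ok", status) or ("error", reason) instead of blocking forever."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "ok", str(resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return "error", f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connection refused, DNS failure, or a timeout while connecting.
        return "error", f"URLError: {exc.reason}"
    except OSError as exc:
        # Read timeouts and "Remote end closed connection without response"
        # are raised directly rather than wrapped in URLError.
        return "error", f"{type(exc).__name__}: {exc}"
```

Calling `probe("http://127.0.0.1:3000/")` and `probe("http://127.0.0.1:5000/")` side by side should show whether the feed port times out, refuses the connection, or closes it without a response.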

glciampaglia commented 3 years ago

The reverse proxy settings were out of sync due to a change in the IP address of the AWS instance, so we haven't had the chance to fix the above problem. Now the reverse proxy is working again, so we should be able to fix this and close the issue.

glciampaglia commented 3 years ago

It seems that the browser is refusing to set the first cookie because it has the "SameSite" attribute set to None but is missing the "secure" attribute, and some browsers, like Chrome, do not allow this. Other browsers, like Firefox, still allow it, but with a warning that they will soon reject it. See here for more info: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie/SameSite

We will try to add the "secure" attribute and see if it works. We also want to make sure that the Node.js endpoint works with the reverse proxy when serving the feed.
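For reference, this is roughly what a Set-Cookie value carrying both attributes looks like. The sketch below builds it with Python's standard `http.cookies` module purely for illustration (the project sets the cookie on the Node.js/Pug side); the cookie name `rockwell_session`, the helper name, and the 10-minute default lifetime are assumptions.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str, max_age_seconds: int = 10 * 60) -> str:
    """Return a Set-Cookie header value with SameSite=None and Secure set."""
    cookie = SimpleCookie()
    cookie["rockwell_session"] = session_id  # hypothetical cookie name
    morsel = cookie["rockwell_session"]
    morsel["max-age"] = max_age_seconds      # lifetime in seconds
    morsel["path"] = "/"
    morsel["samesite"] = "None"              # sent on cross-site requests
    morsel["secure"] = True                  # needed alongside SameSite=None
    return morsel.OutputString()


header_value = build_session_cookie("abc123")
print("Set-Cookie:", header_value)
```

The `samesite` key is supported by `http.cookies` from Python 3.8 onward.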

glciampaglia commented 3 years ago

We are now including the "secure" attribute in the cookie; however, it is still not being set by the browser, because this attribute requires the cookie to be received over a secure connection (https).

One possible solution, besides switching to an https connection, could be to fix the domain name, so that the cookie is not seen as from a different domain anymore.

For more information, see here:

https://www.chromium.org/updates/same-site

and here:

https://web.dev/samesite-cookies-explained/

glciampaglia commented 3 years ago

Now that we have attention checks, we have decided to increase the expiration time of the cookie to 30 minutes. This is to make sure that the cookie does not expire during the task. When the cookie has expired, the user will be taken back to the start of the task (page 0 and attention 0). This way we ensure that if people navigate back to a stale survey, they restart from scratch instead of resuming where they left off. This will not affect the task itself, since it will be shorter than the expiration time of the cookie. However, it will ensure that users are not able to view tweets that have been deleted in the meantime, which was the rationale for having session cookies in the first place.
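The restart rule can be sketched as follows. The function name `resume_position` and its parameters are hypothetical, not taken from the Rockwell code; the 30-minute lifetime and the (page 0, attention 0) reset are the values given above.

```python
from typing import Optional, Tuple

COOKIE_TTL_SECONDS = 30 * 60  # 30-minute expiration decided above


def resume_position(
    cookie_age_seconds: Optional[float],
    saved_page: int,
    saved_attention: int,
) -> Tuple[int, int]:
    """Return the (page, attention) position the participant should see.

    A missing cookie (age is None) is treated like an expired one.
    """
    if cookie_age_seconds is None or cookie_age_seconds >= COOKIE_TTL_SECONDS:
        # Stale session: restart from scratch.
        return (0, 0)
    # Fresh session: resume where the participant left off.
    return (saved_page, saved_attention)
```

For example, a participant who returns after 45 minutes with a saved position of page 3 and attention 2 would be sent back to `(0, 0)`, while one who reloads after 5 minutes would stay at `(3, 2)`.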

glciampaglia commented 3 years ago

After fixing the reverse proxy, the cookies now work. Right now the cookie expiration is set to 2 minutes. We need to make sure that the expiration is set to 30 minutes.

saumyabhadani95 commented 3 years ago

Cookie expiration set to 30 mins

glciampaglia commented 3 years ago

Right now, when the cookie has expired, the user is sent back to the first feed (feed 1 of 5) ONLY when they refresh the browser. Instead, we would like things to happen differently:

(move to the top)

glciampaglia commented 3 years ago

We discussed this issue again, and we decided we are going to implement this directly in React.

glciampaglia commented 3 years ago

We also want to make sure we use the new batch compliance endpoint when we need to refresh the tweets.

glciampaglia commented 2 years ago

We discussed this again and decided that, at the moment, this is not a critical issue, as long as we make sure that participants cannot re-access the app after the survey has been completed. This way we should not run into compliance issues. We will move this task to Utopia and reconsider it later on, resources permitting.