Open ravila4 opened 2 years ago
Update: I have enabled creating anonymous public genesets. It may be beneficial to enforce XSRF cookies if we only want the approved frontend application to do this.
Some other important considerations regarding anonymous genesets:
Limiting POST and PUT requests to the /user_genesets endpoint to, say, once per second.
I'm leaving this open for reference, in case we want to implement other security features.
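As a rough illustration of the once-per-second idea, here is a minimal in-memory sketch. The class name, the choice of client identifier (for example an IP address or the value of the user cookie), and the idea of answering rejected requests with HTTP 429 are all assumptions for illustration, not part of the existing backend.

```python
import time
from typing import Callable, Dict


class MinIntervalLimiter:
    """Hypothetical helper: allow at most one write per client every `min_interval` seconds."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._last_allowed: Dict[str, float] = {}

    def allow(self, client_id: str) -> bool:
        """Return True if this client may write now, False if it is too soon."""
        now = self.clock()
        last = self._last_allowed.get(client_id)
        if last is not None and now - last < self.min_interval:
            return False  # the handler would reject here, e.g. with HTTP 429
        self._last_allowed[client_id] = now
        return True
```

A handler for POST and PUT on /user_genesets could call allow() before doing any work. Note that this sketch keeps state in a single process and never evicts old entries, so a multi-process deployment would need a shared store instead.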
Note, we also previously discussed using XSRF cookies to prevent other websites/clients from writing anonymous genesets.
However, this would not prevent the automated writing or editing of genesets, which can still be done easily if the user logs in with a browser and copies the value of the user_cookie.
Implementation of XSRF cookie protections
The code that I commented out here would enable XSRF cookie checks on anonymous POST requests to the user_geneset endpoint: https://github.com/biothings/mygeneset.info/blob/master/src/web/handlers/api.py#L44-L49
Currently, the xsrf_token endpoint provides an HTML chunk that, when rendered by the frontend, should generate an xsrf_cookie. The problem is that if it is rendered inside an iframe, the requesting code may not be able to read the value of this cookie to submit as a custom HTTP header named X-XSRFToken. We need to check whether it can be accessed from the JavaScript code that makes the request: https://github.com/biothings/mygeneset.info-website/blob/main/src/api/genesets.ts#L127-L164
Tornado documentation: https://www.tornadoweb.org/en/stable/_modules/tornado/web.html#RequestHandler.check_xsrf_cookie
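To make the cookie/header requirement concrete, here is a simplified sketch of the double-submit pattern that this kind of check relies on. It is not Tornado's actual implementation (Tornado decodes versioned tokens before comparing them). The function name is hypothetical, and the cookie name _xsrf is Tornado's default, assumed here to be unchanged in this deployment.

```python
import hmac
from typing import Mapping


def xsrf_tokens_match(cookies: Mapping[str, str],
                      headers: Mapping[str, str]) -> bool:
    """Hypothetical helper: accept a request only if it echoes the XSRF cookie in a header."""
    cookie_token = cookies.get("_xsrf")  # assumed default cookie name
    # Plain dict lookup; real HTTP header names are case-insensitive.
    header_token = headers.get("X-XSRFToken")
    if not cookie_token or not header_token:
        return False  # the frontend could not read or did not send the token
    # Constant-time comparison to avoid leaking the token through timing.
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

This is why the iframe question matters: if the frontend JavaScript cannot read the cookie, it has nothing to copy into the header, and a check like this fails even for the approved frontend.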
Copied from https://github.com/biothings/mygeneset.info-website/issues/30
Currently, the website's "Build" page allows users to build and download genesets while logged out. There is also a "Create" button that would allow the user to create an "anonymous" geneset in the database, but this feature is not implemented at the moment.
We need to decide whether we want to support these two Build/Download features, and may need to implement a few things to make these workflows smoother.
Downloading anonymous genesets - This is mostly working, and I think it's a good idea to keep. One thing that could be improved is offloading some of the geneset creation code to the backend. The benefit would be that the data would match exactly what the database would record if the user were logged in. One way to do it is to allow unauthenticated POST requests with the --dry_run flag (this could also be useful for testing). In this case, to download a geneset we simply fetch the JSON from the response and transform it to csv/tsv/gmx formats if needed.
Creating anonymous genesets - Not supported. Currently returns a 401 Unauthorized error if the user is not logged in. If we don't plan to support it, we should remove the button for logged-out users, and update the text under the Login page's "Use As Guest" section. I'm open to discussing reasons to support it, but I think we would have to address a few questions on the backend, namely:
I'll duplicate this issue in the mygeneset backend repository, to track any changes it may require in the backend.
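For the dry-run idea described under "Downloading anonymous genesets" above, a minimal sketch might look like the following. The function names, the document fields (name, genes, author, is_public), the in-memory list standing in for the database, and the two-column TSV layout are all assumptions for illustration and do not reflect the actual mygeneset.info schema.

```python
import csv
import io
from typing import Any, Dict, List, Optional


def create_geneset(payload: Dict[str, Any], store: List[Dict[str, Any]],
                   dry_run: bool = False,
                   author: Optional[str] = None) -> Dict[str, Any]:
    """Build the document the backend would record; skip the write on a dry run."""
    doc = {
        "name": payload["name"],
        "genes": sorted(set(payload.get("genes", []))),  # de-duplicate gene IDs
        "author": author or "anonymous",
        "is_public": True,
    }
    if not dry_run:
        store.append(doc)  # stand-in for the real database write
    return doc


def geneset_to_tsv(doc: Dict[str, Any]) -> str:
    """Flatten a geneset document to TSV, one row per gene."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["geneset", "gene"])
    for gene in doc["genes"]:
        writer.writerow([doc["name"], gene])
    return buf.getvalue()
```

With this shape, a logged-out download would call create_geneset with dry_run=True and convert the returned document, so the downloaded data goes through the same code path as a real write without touching the database.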