chanzuckerberg / single-cell-data-portal

The data portal supporting the submission, exploration, and management of projects and datasets to cellxgene.
MIT License
63 stars 12 forks

Add proxy for analytics to limit impact of adblockers #6167

Closed ainfeld closed 11 months ago

ainfeld commented 11 months ago

Reason: Plausible found that, with their default setup, approximately 6% to 26% of users have ad blockers configured in a way that would automatically block Plausible analytics for cellxgene. This means the true traffic to cellxgene is likely higher than what Plausible reports right now.

Description: To limit how much traffic these ad blockers block automatically, Plausible suggests proxying the script through cellxgene's domain to Plausible's domain.

cc: @niknak33

joyceyan commented 11 months ago

Since we already use next.js in our application, we would probably want to proxy the Plausible script by following this guide. I have a tentative PR here for what this would look like. However, I'll be pausing work on this until the trust review is complete. Here is the trust review ticket: https://czi.atlassian.net/servicedesk/customer/portal/155/TR-2819
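
For context, a minimal sketch of what a next.js proxy of this kind might look like, assuming the script and event endpoint are forwarded to plausible.io. The function name `plausibleRewrites` and the constant `PLAUSIBLE_ORIGIN` are hypothetical and are not taken from the tentative PR; only the two paths (`/js/script.js` and `/api/event`) appear elsewhere in this thread.

```typescript
// Hypothetical sketch: rewrite rules that proxy Plausible through the app's own domain.
// The object shape matches what next.js expects from `rewrites()` in next.config.js.
interface Rewrite {
  source: string;      // path requested on our own domain
  destination: string; // upstream URL the request is forwarded to
}

// Assumed upstream host; a self-hosted Plausible instance would use a different one.
const PLAUSIBLE_ORIGIN = "https://plausible.io";

function plausibleRewrites(): Rewrite[] {
  return [
    // The tracking script, served from a first-party path.
    { source: "/js/script.js", destination: `${PLAUSIBLE_ORIGIN}/js/script.js` },
    // The event endpoint the script posts pageviews to.
    { source: "/api/event", destination: `${PLAUSIBLE_ORIGIN}/api/event` },
  ];
}

// In next.config.js this would be wired up roughly as:
//   module.exports = { async rewrites() { return plausibleRewrites(); } };
```

With rules like these, the browser only ever talks to cellxgene's own domain, which is what keeps domain-based ad blocker rules from matching.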

joyceyan commented 11 months ago

Security has signed off on this use case, but only for usage with Cloudfront (instead of next.js), since the Cloudfront proxy can enforce that traffic is only rerouted over HTTPS.

They have asked me to check in with legal on whether this violates our existing privacy policy, so I opened a ticket with legal here: https://czi.atlassian.net/servicedesk/customer/portal/167/LEGIT-146

joyceyan commented 11 months ago

Legal and security have both signed off. PR is up here for setting this up with Cloudfront.

joyceyan commented 11 months ago

This seems to be fine in staging, though I did send Plausible a PR to update their documentation since I think it's slightly misleading: https://github.com/plausible/docs/pull/447

GET request to /js/script.js returns a 200


POST request to /api/event returns a 202

curl -i -X POST https://cellxgene.staging.single-cell.czi.technology/api/event \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36 OPR/71.0.3770.284' \
  -H 'X-Forwarded-For: 127.0.0.1' \
  -H 'Content-Type: application/json' \
  --data '{"name":"pageview","url":"http://dummy.site","domain":"dummy.site"}'
HTTP/2 202

joyceyan commented 11 months ago

Verified this works in prod as well, with the same GET/POST requests.
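
The same GET/POST checks could be scripted rather than run by hand. The sketch below is only an illustration: `checkProxy`, `pageviewBody`, and `FetchLike` are hypothetical names, the payload mirrors the dummy pageview from the curl command above, and the `User-Agent` and `X-Forwarded-For` headers from that command are omitted for brevity.

```typescript
// Minimal fetch-like signature so the check can run against a stub or a real fetch.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ status: number }>;

// Same dummy pageview payload as the curl command in this thread.
function pageviewBody(): string {
  return JSON.stringify({ name: "pageview", url: "http://dummy.site", domain: "dummy.site" });
}

// Hypothetical helper: resolves to true when the proxied script is served with a 200
// and the proxied event endpoint accepts the pageview with a 202.
async function checkProxy(baseUrl: string, fetchImpl: FetchLike): Promise<boolean> {
  const script = await fetchImpl(`${baseUrl}/js/script.js`);
  const event = await fetchImpl(`${baseUrl}/api/event`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: pageviewBody(),
  });
  return script.status === 200 && event.status === 202;
}
```

Passing the fetch implementation in as a parameter keeps the helper testable without network access; on a runtime with a global `fetch`, it should be possible to pass that in together with the staging or prod base URL.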

tihuan commented 10 months ago

Really amazing work! Thanks for the updates here throughout the project too, @joyceyan 🏆🙏