pyodide / pyodide-blog

The Pyodide blog
https://blog.pyodide.org
Mozilla Public License 2.0

Replace Google Analytics with Plausible analytics #33

Closed (rth closed this 1 year ago)

rth commented 1 year ago

Currently, we are using Google Analytics for viewer stats (which are particularly useful for knowing which browser versions visitors use).

Google Analytics is not great privacy-wise and, in particular, stores a bunch of tracking cookies.

This means that if we want to be more compliant with various data privacy regulations (including GDPR in Europe), we would at least need to add a cookie popup (which is not helpful).

Instead, this PR switches to Plausible analytics, which is lighter, open-source and more privacy-focused (in particular, it doesn't store any cookies). While the platform is open-source, the hosted version is not free, and I'm paying for it as part of my other projects.

Once enabled, aggregated stats will be visible at https://plausible.io/blog.pyodide.org (I made the dashboard publicly visible, as there is nothing sensitive there).

netlify[bot] commented 1 year ago

Deploy Preview for pyodide-blog ready!

Latest commit: 0e81a1a8fbfc58502ed32ffd7ac45245612e7653
Latest deploy log: https://app.netlify.com/sites/pyodide-blog/deploys/62ff780df6fd380009220b2b
Deploy Preview: https://deploy-preview-33--pyodide-blog.netlify.app

rth commented 1 year ago

> Do we use analytics when there are hits to console.html too?

Not currently. But I was planning to migrate the docs to Plausible next, and we can also add it to console.html.

> What about for hits to pyodide.js?

The jsDelivr CDN has those raw stats. The last time I asked, retrieving them came at a significant monthly cost: the logs are mixed with those of the other packages they host, there is a lot of log data per day, one needs to write a custom extraction pipeline, and the compute cost is significant. I opened https://github.com/jsdelivr/data.jsdelivr.com/issues/49 to see if the situation has changed.

rth commented 1 year ago

So apparently we are at 300-600k downloaded files per day (https://github.com/jsdelivr/data.jsdelivr.com/issues/49#issuecomment-1220629195). Given that the core install loads at least 8 assets, that's roughly 40-70k page loads per day (and also 500 GB/day, probably uncompressed). Being on the home page of NumPy, etc. probably contributes a lot.
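For reference, the back-of-the-envelope estimate above can be sketched as follows. This is only a rough illustration: the function name is made up for this example, and it assumes every page load fetches exactly the 8-asset minimum, so the results are upper bounds on the real number of page loads.

```python
def estimate_page_loads(files_per_day: int, assets_per_load: int = 8) -> int:
    """Rough upper bound on daily page loads.

    Assumes each page load fetches exactly `assets_per_load` files.
    Loads that pull in extra packages fetch more files, so the true
    number of page loads is lower than this estimate.
    """
    return files_per_day // assets_per_load


# Range of downloaded files per day reported in the jsDelivr issue.
low = estimate_page_loads(300_000)
high = estimate_page_loads(600_000)

print(f"{low:,} to {high:,} page loads per day")
```

With these inputs the bounds come out at 37,500 and 75,000, which is where the rounded 40-70k figure comes from.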

For comparison, we have around 300 daily visits to the docs and twice as many to the GitHub README, some of which end up on the REPL. So console.html would likely be a very small fraction of total downloads.