Raku / doc

🦋 Raku documentation
https://docs.raku.org/
Artistic License 2.0

Add Google Analytics code to the website #1862

Closed JJ closed 6 years ago

JJ commented 6 years ago

Since the grant has started officially, one of the things I'd like to do is to have a clearer idea of what visitors do, where they come from, and where we lose them; I'd like to add Google Analytics code to the site and have it working for at least a month. In principle, I'd use my own code, but if there's some official Google account I could use, and whose results I can access, that would be fine. In principle, I would remove it once I finish the grant, but I can leave it there if it helps, or change it to another one. Any ideas for/against?

nxadm commented 6 years ago

I don't know if this is something desired.

Users like me frown upon Google Analytics (or any other type of tracker). Furthermore, every single adblocker out there (and DNS setups like Pi-hole) does its job regarding Google Analytics and blocks it. I presume our users are the type that will at least run adblockers, resulting in severely skewed results.

Wouldn't webserver logs provide all the needed info? E.g. what docs pages are popular?
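
As a rough illustration of the "which docs pages are popular" question, here is a minimal sketch in Python. It assumes the server writes access logs in the Common/Combined Log Format that Apache and nginx produce by default; the function name `popular_pages` and the sample lines are hypothetical, not taken from the real docs server.

```python
import re
from collections import Counter

# Matches the request and status part of a Common/Combined Log Format line,
# e.g. ... "GET /type/Str HTTP/1.1" 200 ...
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3}) ')


def popular_pages(lines, top=10):
    """Return the `top` most requested paths among successful (2xx) responses."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2).startswith("2"):
            # Drop any query string so /type/Str?x=1 counts as /type/Str
            counts[match.group(1).split("?", 1)[0]] += 1
    return counts.most_common(top)


# Hypothetical sample lines (documentation-range IPs), not real traffic
SAMPLE = [
    '203.0.113.5 - - [10/Mar/2018:10:00:00 +0000] "GET /type/Str HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Mar/2018:10:01:00 +0000] "GET /type/Str?x=1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Mar/2018:10:02:00 +0000] "GET /language/regexes HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Mar/2018:10:03:00 +0000] "GET /missing HTTP/1.1" 404 300 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path, hits in popular_pages(SAMPLE):
        print(hits, path)
```

This only counts raw requests, so crawlers and repeated reloads are included unless they are filtered out separately.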

JJ commented 6 years ago

They might, but I don't have access to the server. For the time being there's just the breakdown of visited pages, but I need a deeper analysis: how many pages are visited per session, the landing page, that kind of thing. I could include it temporarily, during the study, and remove it once I get an idea of what's going on.

nxadm commented 6 years ago

Not vetoing anything (or in a position to do so :) ), just a heads up that privacy-aware users don't know whether their info is used correctly or not.

Looking forward to your work!

JJ commented 6 years ago

Someone on a privacy list I participate in has suggested this one: https://matomo.org/. It has a 30-day free trial period, which could be enough for this purpose. Would that be OK?

Tyil commented 6 years ago

I am very much against using a proprietary or 3rd party service to track all users of the site. If it were a self-hosted system (so all data doesn't get shared with Google or another service), I don't mind.

@JJ Matomo seems usable at first glance since it can be self-hosted, which would also remove the "trial period", it seems. I am currently not against it, as long as it is self-hosted.

nxadm commented 6 years ago

Just anecdotal, but my adblocker (uBlock Origin) blocks that domain by default :).

Tyil commented 6 years ago

@nxadm One of the many blocking plugins I use in Firefox also takes it out. But I don't expect everyone to have a privacy-centric setup in their browsers. These people also shouldn't be tracked by 3rd parties by using our docs. Especially not by 3rd parties that are known to abuse this data for corporate profit.

nxadm commented 6 years ago

@Tyil Yes, and on top of that my Pi-hole blocks it as well. My post should not be read as "it's ok, it's blocked" but as "it's pretty useless because it's blocked".

JJ commented 6 years ago

So we would have to go with something self-hosted, right?

Tyil commented 6 years ago

If you want to track users, that's the way I would suggest you go, yes.

stmuk commented 6 years ago

Self-hosting would be a lot more work than the cloud solutions, which generally just mean adding a few lines of JavaScript to the HTML. That literally takes a few minutes, rather than the few hours of self-hosting setup and probably maintenance time.

If people are offering to do the work of self-hosting then that's good, but I'm not sure it's worth it for a few weeks of run time.

You could offer an "opt out" link for people quite easily with Google Analytics, and probably with other services like Matomo, for anyone objecting to the privacy issues who isn't already running privacy-enhancing browser plugins.

JJ commented 6 years ago

That would really be helpful, and it's something I can personally do without depending on other people and their installation policies on production servers. It would really be just temporary, and it would serve the greater good of improving Perl6's documentation.

Tyil commented 6 years ago

@stmuk I actually set up Matomo to see how much effort it is. Though I did use docker for it, the effort is minimal, certainly not "a few hours".

I still maintain we should have privacy by default on the docs, not "all your privacy is void unless you find this hidden button to give you privacy from this moment onwards".

@JJ A self-hosted Matomo is easy to set up and seems to give everyone what they want. I'm voting for this solution.

AlexDaniel commented 6 years ago

To the title:

⚠ No. ⚠

Adblockers get rid of this stuff, and for a good reason. Let's not spit in the face of our users even if we have good intentions (a large number of them have face shields anyway).

> Wouldn't webserver logs provide all the needed info? E.g. what docs pages are popular?

> They might, but I don't have access to the server

Then you should just ask around on #perl6 and I'm sure you'll be provided with access.

As for self-hosted Matomo… uhhhh… I don't know. Looking at their website (which I had to temporarily whitelist in my adblocker…):

> Browser addons blocking the Matomo (Piwik) Tracking Javascript (NoScript, DoNotTrack, etc.) If you use browsers addons such as Adblock, Adblock plus, NoScript, Ghostery or others, the Matomo Javascript code is not executed in your browser. Try to use a different browser that does not have these extensions, or disable these browser extensions and try again.

So IMO we should not waste any effort on this fight with users. Look at the Apache logs (or whatever the server uses), and you'll get half of the info you want. As a user, I think the other half is something you shouldn't have.

> I still maintain we should have privacy by default on the docs, not "all your privacy is void unless you find this hidden button to give you privacy from this moment onwards".

Correct.

JJ commented 6 years ago

OK. I'll try to obtain some logs and see how I can work with them to extract the information I need. I'm closing this now.

Tyil commented 6 years ago

@JJ You can still use Matomo in that scenario. Fire up a VM, container, or whatever you like, and load up the logs into that.

But there's plenty of other ways one can parse the logs. I've heard Perl is great at parsing text ;)
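
For the session-level questions raised earlier in the thread (pages per session, landing page), a log-only approach could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the Combined Log Format, treating an IP address plus user agent as one visitor, the 30-minute inactivity gap that ends a session, and the names `sessions_from_log` and `SAMPLE`. It is an approximation, since several people can share an IP address, and it is not a description of how Matomo itself processes logs.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Combined Log Format: ip, timestamp, request path and user agent are captured
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|HEAD) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)
SESSION_GAP = timedelta(minutes=30)  # assumed inactivity cut-off, not a standard


def sessions_from_log(lines):
    """Group hits into sessions; each session is a list of paths in visit order."""
    hits = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, stamp, path, agent = match.groups()
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        hits[(ip, agent)].append((when, path))  # (ip, agent) approximates a visitor

    sessions = []
    for visitor_hits in hits.values():
        visitor_hits.sort()
        current = [visitor_hits[0][1]]
        for (previous, _), (when, path) in zip(visitor_hits, visitor_hits[1:]):
            if when - previous > SESSION_GAP:
                sessions.append(current)  # long pause: start a new session
                current = []
            current.append(path)
        sessions.append(current)
    return sessions


# Hypothetical sample lines (documentation-range IPs), not real traffic
SAMPLE = [
    '203.0.113.5 - - [10/Mar/2018:10:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Mar/2018:10:01:00 +0000] "GET /type/Str HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Mar/2018:10:05:00 +0000] "GET /type/Str HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Mar/2018:11:00:00 +0000] "GET /language/regexes HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    sessions = sessions_from_log(SAMPLE)
    landing_pages = Counter(session[0] for session in sessions)
    pages_per_session = sum(len(s) for s in sessions) / len(sessions)
    print(landing_pages.most_common(), round(pages_per_session, 2))
```

The first path of each session is its landing page, and the session lengths give the pages-per-session figure, all without putting any tracking code on the site.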