simonfrey / matomo_circumvent_adblock

Matomo/Piwik anti-adblock PHP script that helps you circumvent the blocking of your tracking by ad blockers
MIT License

Thought #2

Open phalox opened 3 years ago

phalox commented 3 years ago

Hey Simon, nice job!

I haven't used your script yet, but I did notice that I'm missing tracked visits. These users already visit my website (the server logs record them), so of course I want to include them too.

Did you ever consider if it's possible to wrap the existing scripts in another one, so that you don't need any generation step?

simonfrey commented 3 years ago

Yes, I considered that, but by keeping the generation step I hope to stay somewhat future-proof if Matomo changes the logic in its script. (Always shipping an update would be too much work, which I do not plan to put into this project :))

phalox commented 3 years ago

I had another look at this...

I don't think you need all this hacking, but I might be missing something.

Create a .htaccess file in the Matomo root with this:

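# Turn on the rewriting engine (Apache does not allow comments on a directive line)
RewriteEngine On
# Anchor and escape the patterns so only m.js and m.php match (not e.g. form.js)
RewriteRule ^m\.js$ matomo.js [PT]
RewriteRule ^m\.php$ matomo.php [PT]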

Then manually change these lines in the tracking snippet:

_paq.push(['setTrackerUrl', u+'m.php']);

and

var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
g.type='text/javascript'; g.async=true; g.src=u+'m.js'; s.parentNode.insertBefore(g,s);

Works for me, at least in my setup. But maybe I'm missing something.

simonfrey commented 3 years ago

Did you test that with an ad blocker? uBlock Origin also blocks Matomo based on the query parameters, which your approach leaves unchanged but the PHP generation script rewrites.

phalox commented 3 years ago

You're right! I just did a proper test (uBlock Origin wasn't installed in my second browser) and the request is indeed blocked, I assume based on this rule:

.php?action_name=

That rule looks like it will produce a lot of false positives; action_name is a parameter name I would use for regular functionality too.
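To illustrate why renaming the file alone does not help, here is a minimal sketch of how a plain filter like this behaves, assuming it is applied as a simple substring match against the request URL. The helper name `matchesPlainFilter` and all example URLs are hypothetical, and real blockers do considerably more than this.

```javascript
// Hypothetical helper: a plain filter with no special characters is
// treated here as matching when it appears anywhere in the request URL.
function matchesPlainFilter(url, filter) {
  return url.includes(filter);
}

const rule = ".php?action_name=";

// Renamed tracker endpoint (m.php instead of matomo.php): still matched,
// because the rule looks at the query string, not the file name.
const renamedTracker =
  "https://example.org/matomo/m.php?action_name=Home&idsite=1";

// Unrelated application URL that uses the same parameter name:
// also matched, which is the false-positive concern.
const unrelated = "https://example.org/admin.php?action_name=save";

// Same parameters in a different order: this particular rule no longer matches.
const reordered =
  "https://example.org/matomo/m.php?idsite=1&action_name=Home";

console.log(matchesPlainFilter(renamedTracker, rule)); // true
console.log(matchesPlainFilter(unrelated, rule));      // true
console.log(matchesPlainFilter(reordered, rule));      // false
```

The last case only shows that this one rule depends on the parameter coming first; it is not a reliable way around the block list.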

Other rules related to Matomo (I'm not sure which other parameters are blocked):

EasyPrivacy:

/matomo-tracking.
/matomo.js$domain=~github.com
/matomo.php
/matomo/*$domain=~github.com|~matomo.org|~wordpress.org
/piwik-$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro|~piwikpro.de
/piwik.$image,script,domain=~matomo.org|~piwik.org|~piwik.pro|~piwikpro.de
/piwik.*/ping?
/piwik.js
/piwik.php
/piwik/*$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro
/piwik1.
/piwik2.js
/piwik_
/piwikapi.js
/piwikC_
/piwikTracker.
://piwik.$domain=~matomo.org|~piwik.pro
||matomo.cloud^$third-party
@@||plugins.matomo.org^$image,~third-party
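For anyone documenting these rules, a rough sketch of how one such line breaks down may help. It assumes the usual Adblock-Plus-style conventions: a leading `@@` marks an exception, everything after the last `$` is a comma-separated option list, and `domain=` takes `|`-separated domains where `~` means "not on this domain". The function `parseFilter` is hypothetical and only covers the features visible in the list above.

```javascript
// Split one filter line into its parts (simplified, illustration only).
function parseFilter(line) {
  const isException = line.startsWith("@@");
  const body = isException ? line.slice(2) : line;

  // Options, if any, follow the last "$" in the line.
  const dollar = body.lastIndexOf("$");
  const pattern = dollar === -1 ? body : body.slice(0, dollar);
  const options = dollar === -1 ? [] : body.slice(dollar + 1).split(",");

  // Collect the domains on which the rule is switched off ("~" prefix).
  const domainOption = options.find((o) => o.startsWith("domain="));
  const excludedDomains = domainOption
    ? domainOption
        .slice("domain=".length)
        .split("|")
        .filter((d) => d.startsWith("~"))
        .map((d) => d.slice(1))
    : [];

  return { pattern, isException, options, excludedDomains };
}

const parsed = parseFilter(
  "/piwik.$image,script,domain=~matomo.org|~piwik.org|~piwik.pro|~piwikpro.de"
);
console.log(parsed.pattern);         // "/piwik."
console.log(parsed.excludedDomains); // matomo.org, piwik.org, piwik.pro, piwikpro.de
```

Read this way, most of the path rules apply on any site except the Matomo/Piwik project domains themselves.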

Would be good to document this! SEO-wise, people will find your repo by searching for the things uBlock Origin and its EasyPrivacy list are blocking.

phalox commented 3 years ago

Yeah, I think your approach is right. There's not much else to do, because the JavaScript code has to be modified in any case. The only other option might be a mod_rewrite rule for the PHP file and its parameters (https://github.com/simonfrey/matomo_circumvent_adblock/blob/master/mm.php), but then again, do you gain anything by this?

I wonder if this could be implemented as a plugin? I also see quite some value in a server-side proxy script on your own hosting, where all the mangling/obfuscation happens between the client and your own website and the request is then forwarded cleanly to Matomo. The WordPress plugin already implements something of the kind. Seeing how more and more browsers and ad blockers are getting more aggressive about privacy, I think it's the only way to still get some insight into your visitors.
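A minimal sketch of the mangling idea, assuming the client renames the query parameters with a fixed table and the proxy on your own host reverses the renaming before forwarding to Matomo. The aliases (`p1`, `p2`, `p3`) and the function `renameParams` are hypothetical and not taken from this repo; `action_name`, `idsite` and `rec` are ordinary Matomo tracking parameters.

```javascript
// Hypothetical alias table: Matomo parameter name -> neutral name.
const ALIASES = new Map([
  ["action_name", "p1"],
  ["idsite", "p2"],
  ["rec", "p3"],
]);

// Reverse table used on the server side to restore the original names.
const REVERSE = new Map([...ALIASES].map(([name, alias]) => [alias, name]));

// Rename every parameter found in the table; leave unknown ones untouched.
function renameParams(query, table) {
  const input = new URLSearchParams(query);
  const output = new URLSearchParams();
  for (const [name, value] of input) {
    output.append(table.get(name) ?? name, value);
  }
  return output.toString();
}

const original = "action_name=Home&idsite=1&rec=1";
const obfuscated = renameParams(original, ALIASES); // sent by the browser
const restored = renameParams(obfuscated, REVERSE); // rebuilt by the proxy

console.log(obfuscated); // p1=Home&p2=1&p3=1
console.log(restored);   // action_name=Home&idsite=1&rec=1
```

The browser only ever sees the neutral names, so a rule keyed on `action_name` has nothing to match; the proxy is the only place that knows the real ones.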

simonfrey commented 3 years ago

Would be good to document this! SEO-wise, people will find your repo by searching for the things uBlock Origin and its EasyPrivacy list are blocking.

Nice idea. Could you take care of that and open a pull request?

I wonder if this thing could be implemented as a plugin?

I'm quite sure that would be possible, but the Matomo plugin interface is quite badly documented, which is why I went for this easier approach :D