Open bnomei opened 1 year ago
There is an undocumented arnoson.kirby-stats.debug option which can be enabled to log the user agent and path, giving more information on where the bot detection is failing.
For my pageview counter plugin I used a tracking pixel below the first render fold. Not sure if you want to go that way.
Great solution, I will look into this (: I still like the simplicity of just using routes, and, at least on my portfolio website, I have a lot of sub-pages that don't scroll at all, so I have to think about how I could handle this.
Similar to the pixel below the first render fold, there is another technique to filter bots by looking for user interaction, e.g. by using a PNG or SVG on body:hover: https://herman.bearblog.dev/how-bear-does-analytics-with-css/ (It still needs an additional style element added on each page, though.)
Looks great @grommasdietz! I definitely think it needs some sort of client-side JS/CSS logic for bot filtering. Maybe the first step would be to create a tracking endpoint in this plugin to test these methods.
One thing I just realized, though, is that uBlock Origin blocks the tracking endpoint of the bearblog website. I'm not sure if this is because it is included in a block list or because of some rule based on the naming of the endpoint (including hit, ref, ...).
Just checked and it is because the hit endpoint of bearblog is blocked by https://easylist.to/
I'm currently experimenting with an API endpoint for tracking, and it seems a CSS-only approach doesn't work. This is because I don't want to hash or save any IP data and instead use the referrer to determine whether something is a visit or just a view (internal navigation inside the website). With the current route hook approach I can read the referrer, but when using an endpoint I would have to send any information I need. Right now I'm thinking about something like this as a start:
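// performance.navigation is deprecated but still widely supported; type 1 means reload.
const isReload = performance.navigation.type === 1
if (!isReload) {
  // Send everything the server needs, since the endpoint cannot read the page's referrer itself.
  const data = new FormData()
  data.append('path', location.pathname)
  data.append('referrer', document.referrer)
  navigator.sendBeacon('/stats/handle', data)
}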
Additionally, we could trigger the endpoint only if a certain event happens or after a timeout of, say, 5 seconds. GoatCounter's count.js might be a helpful resource.
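The "event or timeout" idea could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `sendOnceOnInteractionOrTimeout`, the list of events, and the 5000 ms default are hypothetical and not part of the plugin.

```javascript
// Hypothetical helper: call `send` exactly once, either on the first user
// interaction or after `delayMs`, whichever happens first.
function sendOnceOnInteractionOrTimeout(target, send, delayMs = 5000) {
  // Assumed set of events that suggest a real user; adjust as needed.
  const events = ['pointermove', 'keydown', 'scroll', 'touchstart']
  let done = false

  const trigger = () => {
    if (done) return // guard against a second event or a late timer
    done = true
    clearTimeout(timer)
    events.forEach((e) => target.removeEventListener(e, trigger))
    send()
  }

  events.forEach((e) => target.addEventListener(e, trigger))
  const timer = setTimeout(trigger, delayMs)
  return trigger
}
```

In a browser, `target` would typically be `document`, and `send` could wrap the `navigator.sendBeacon('/stats/handle', data)` call from the snippet above.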
I’m definitely not up on best practices for this topic and don’t have the insights you have. While I prefer handling statistics without additional images, CSS, or JS, shouldn’t it still be possible to trigger any PHP logic by returning the image from a simple route?
'routes' => [
    [
        'pattern' => 'statistics/(:all).svg',
        'action' => function ($all) {
            $path = $all == '' ? option('home', 'home') : $all;
            $page = page($path);
            if (!$page) {
                return page('error');
            }
            // Handle necessary plugin logic
            $content = '<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"></svg>';
            return new Response($content, 'image/svg+xml');
        },
    ],
],
The Kirby snippet could look like:
<style>
  body:hover {
    border-image: url("/statistics<?= Url::short($page->url()) ?>.svg");
  }
</style>
The problem with this is that we lose the referrer and therefore can't distinguish between a view and a visit. Most analytics tools I know of use the hashed IP address to do this instead. We could send the referrer with PHP:
<style>
  body:hover {
    border-image: url("/statistics<?= Url::short($page->url()) ?>/<?= $_SERVER['HTTP_REFERER'] ?>.svg");
  }
</style>
but this won't work with caching. So I guess it is either sending the referrer with JS or using another method to distinguish views from visits. But maybe I'm missing something.
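To make the view/visit distinction concrete, here is a minimal sketch of how a referrer could be classified. The function name `classifyHit` is hypothetical, and treating an empty or unparsable referrer as a visit is an assumption, not something the plugin is confirmed to do.

```javascript
// Hypothetical classifier: a hit whose referrer has the same origin as the
// site counts as a "view" (internal navigation); everything else is a "visit".
function classifyHit(referrer, siteOrigin) {
  // Assumption: no referrer (direct access, stripped header) counts as a visit.
  if (!referrer) return 'visit'
  try {
    const sameOrigin = new URL(referrer).origin === new URL(siteOrigin).origin
    return sameOrigin ? 'view' : 'visit'
  } catch {
    // Assumption: a malformed referrer is treated like a missing one.
    return 'visit'
  }
}
```

In the browser this could be called as `classifyHit(document.referrer, location.origin)`; the same comparison could equally run on the server once the referrer has been sent along.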
Ah okay! So even when adding a random hash on each page load to the border image to avoid caching, the html still gets cached and the image/navigation won’t be recognised?
Just hopped on to the discussion after finding out about the technique used by bearblog. I’m sure you’ll find a good way to improve the plugin logic.
Thanks for your work, looking into the ideas and explaining your considerations!
Yes, I meant the Kirby HTML cache. If it is enabled, the referrer part <?= $_SERVER['HTTP_REFERER'] ?> in my version of your SVG example will also be HTML-cached, and therefore a stale referrer will be sent to the route. So yeah, maybe a super simple script is the best option. This would also allow adding some additional logic to filter bots in the future. Thanks for your input and interest in this plugin :) It motivates me to continue the development now that other people want to use it too!
Looks promising, just to get it: We have to call the scripts function each time we load a page, like on an ajax request, right? I think the removeEventListeners function has to be slightly corrected:
const removeEventListeners = () =>
  events.forEach((e) => document.removeEventListener(e, sendStats, eventOptions))
Not sure if necessary, but it’s safer to include the eventOptions on removal as well:
It's worth noting that some browser releases have been inconsistent on this, and unless you have specific reasons otherwise, it's probably wise to use the same values used for the call to addEventListener() when calling removeEventListener().
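One way to keep the two calls in sync is to share a single options object between registration and removal. The sketch below assumes a hypothetical `addTracking` helper and example values for `events` and `eventOptions`; the actual values in the pull request may differ.

```javascript
// Assumed example values; the plugin's real event list and options may differ.
const events = ['pointermove', 'keydown']
const eventOptions = { capture: true }

// Hypothetical helper: registers `sendStats` for every event and returns a
// function that removes it again using the very same options object.
function addTracking(target, sendStats) {
  events.forEach((e) => target.addEventListener(e, sendStats, eventOptions))
  return () =>
    events.forEach((e) => target.removeEventListener(e, sendStats, eventOptions))
}
```

Because both calls reference `eventOptions`, the `capture` flag cannot drift apart between adding and removing the listener.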
Created a pull request!
Most of the bots should be ignored, since it uses both Matomo's DeviceDetector and Jaybizzle's CrawlerDetect. I run it on my portfolio site for testing and get some obvious bots nevertheless. Do you have any idea how to improve this?