Currently, conversions are attributed (to a referer or campaign) by two means:
By piwik.js storing campaign name & keyword in a cookie and passing it to _rcn or _rck in subsequent visits
By the server side extracting referer information from the current visit
This limits how accurately conversions can be attributed. If a user converts in a second or later visit (where URL referer/campaign information is no longer available) and the cookie is unavailable, attribution for that conversion is lost. For example:
1. Thanks to campaign X, a user initially visits the site on their cellphone
2. By setting a user-id, the user is tracked across multiple devices
3. One day later, the user visits on their tablet and converts
The conversion is now attributed to "direct entry" instead of campaign X.
The same thing happens:
When the user has cleared cookies between steps 1 and 3
When the conversion is tracked by an external system instead of through piwik.js
Other limitations are:
Dimensions other than campaign name & keyword (such as source & medium that are added by the campaign plugin) are currently not stored in the cookie, so they are always lost between visits
The architecture can only deal with a single source, so multi-channel attribution (#6064) is very hard to implement on top of it
I propose that cross-visit goal attribution be handled on the server side instead. When a conversion is created, instead of fetching dimensions from the current visit, we should simply look through all of the visitor's visits and use the last (or first) visit with non-empty attributes. setConversionAttributionFirstReferrer would move from piwik.js to a server-side configuration option.
As far as I can see this would solve all problems, without significant performance issues:
explain select * from piwik_log_visit where idvisitor='abc' and (campaign_content is not null or campaign_id is not null or campaign_keyword is not null or campaign_medium is not null or campaign_name is not null or campaign_source is not null) order by idvisit desc limit 1;
+----+-------------+-----------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | piwik_log_visit | NULL | index | NULL | PRIMARY | 8 | NULL | 1 | 16.67 | Using where |
+----+-------------+-----------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
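To make the proposed lookup concrete, here is a minimal sketch of it in Python against an in-memory SQLite database. The table `log_visit`, the function `find_attribution_visit` and the `use_first` flag are hypothetical stand-ins for illustration (the flag plays the role that setConversionAttributionFirstReferrer would have as a server-side option). It is not how the tracker is actually implemented, and it only covers a subset of the campaign columns from the query above.

```python
import sqlite3

# Hypothetical, simplified stand-in for piwik_log_visit: only a few of the
# campaign columns from the query above are modelled here.
SCHEMA = """
CREATE TABLE log_visit (
    idvisit INTEGER PRIMARY KEY,
    idvisitor TEXT NOT NULL,
    campaign_name TEXT,
    campaign_keyword TEXT,
    campaign_source TEXT,
    campaign_medium TEXT
)
"""

ATTRIBUTION_COLUMNS = (
    "campaign_name",
    "campaign_keyword",
    "campaign_source",
    "campaign_medium",
)


def find_attribution_visit(conn, idvisitor, use_first=False):
    """Return the visitor's last (or first) visit that has any campaign
    attribute set, as a dict, or None if no such visit exists."""
    has_attributes = " OR ".join(f"{col} IS NOT NULL" for col in ATTRIBUTION_COLUMNS)
    order = "ASC" if use_first else "DESC"
    query = (
        f"SELECT idvisit, {', '.join(ATTRIBUTION_COLUMNS)} FROM log_visit "
        f"WHERE idvisitor = ? AND ({has_attributes}) "
        f"ORDER BY idvisit {order} LIMIT 1"
    )
    row = conn.execute(query, (idvisitor,)).fetchone()
    if row is None:
        return None
    return dict(zip(("idvisit",) + ATTRIBUTION_COLUMNS, row))


def build_demo_db():
    """Create sample data mirroring the cellphone/tablet example above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO log_visit VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "abc", "campaign X", "shoes", "newsletter", "email"),  # cellphone, via campaign X
            (2, "abc", None, None, None, None),                        # tablet, direct entry
            (3, "abc", "campaign Z", None, "partner", "banner"),       # later campaign visit
            (4, "abc", None, None, None, None),                        # direct entry, converts
            (5, "other", "campaign Y", None, None, None),              # unrelated visitor
        ],
    )
    return conn
```

With this sample data, a conversion during visit 4 would be attributed to campaign Z under last-visit attribution and to campaign X under first-visit attribution, rather than to "direct entry".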
Perhaps the only caveat would be installations where the raw visitor logs are deleted after X days, so attribution wouldn't be possible if the conversion happens more than X (180 by default) days after the initial visit. I don't think that's an issue, but if it is, we could keep the cookie attribution information as a fallback.
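The fallback order could look roughly like the sketch below. The function `resolve_attribution` and the `resolved_from` key are hypothetical names used only for illustration. The two inputs stand for the result of the server-side visit-log lookup and the values piwik.js currently passes from its cookie.

```python
from typing import Optional


def resolve_attribution(from_visit_log: Optional[dict], from_cookie: Optional[dict]) -> dict:
    """Prefer the server-side visit-log lookup; fall back to the cookie
    values if the raw logs were already purged; otherwise report no campaign."""
    if from_visit_log:
        return {"resolved_from": "visit_log", **from_visit_log}
    if from_cookie:
        return {"resolved_from": "cookie", **from_cookie}
    # Neither source has data: the conversion stays a direct entry.
    return {"resolved_from": "none", "campaign_name": None, "campaign_keyword": None}
```

The cookie is only consulted when the log lookup returns nothing, so installations that keep their raw logs would behave exactly as proposed.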