matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

When enabling third party cookie tracking, visitor ID is the same across measurables. #15733

Open jjcitron opened 4 years ago

jjcitron commented 4 years ago

We enabled third-party cookies to enhance our tracking and to better track across domains for some clients.

In production we were testing a few sites, and when visiting a site we saw, on the front end, a user with a visitor ID of "ABC", for example. This was the value passed to matomo.php.

However, when going into the UI, we were unable to find this visitor ID.

We were able to track down the user activity (by IP), and we saw a different visitor ID...

Searching for that ID using the GDPR tools, we were able to find it, and it was spanning multiple sites and measurables?

This seems like a bug?

Cross-measurable fingerprinting is not enabled, so it doesn't make sense that the visitor ID would span sites.

tsteur commented 4 years ago

Looking at the code, this is expected currently. It is also described in https://matomo.org/faq/how-to/faq_118/ that while we don't report across sites, you can run raw data queries to get this data should you need it (or use Roll Up Reporting).

I suppose it could be considered a bug, as enable_fingerprinting_across_websites is disabled by default. Changing this behaviour would break existing visitor IDs, so that's something we could only do in a major release like Matomo 4.0 or 5.0, and we would need to announce transparently that third-party cookie users would want to enable enable_fingerprinting_across_websites should they want to keep the same visitorIds.

I suppose the main reason for third-party cookies is tracking users across websites, which is why it currently works like this.

Do you mind me asking why you are using third-party cookies? I suppose you would actually want this behaviour as well, that the same ID is used across sites, but you are reporting it anyway?

jjcitron commented 4 years ago

The reason was to keep the same user ID across sites, but in the same measurable. For example, I have a client that has a website, client.com. Their checkout is on client.force.com, for example. Usually we would add a linker, but at times the user may not travel directly from their site to the checkout site. Additionally, some of these third-party sites have a "break" in the tracking, not allowing us to use the linker. Using the third-party cookie does accomplish this, but besides making the visitor ID remain the same across measurables, it caused the visitor ID reported by _paq.push([ function() { this.getVisitorId(); }]); (or the _pk_id. cookie) to be different from the ID that Matomo is showing in the reports. This is actually the largest concern, as we are using the visitor ID to tie offline conversions back to website activity (for example a phone call, or a Salesforce form submission to sale), and now the ID sent to the third-party platform no longer matches what is in Matomo, and even searching for the ID that was showing on the front end yields no results.
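For readers following along, here is a minimal sketch of reading the client-side visitor ID with the `_paq.push([function () { ... }])` pattern quoted above, so it can be passed to another system. The helper name `captureVisitorId`, the callback, and the hidden field id in the usage comment are hypothetical and not part of Matomo. Only the queued-function pattern and `getVisitorId()` come from the thread.

```javascript
// Hypothetical helper: queue a callback on the _paq array. Once the tracker
// has loaded, it calls queued functions with `this` bound to the tracker
// instance, which is what makes this.getVisitorId() available.
function captureVisitorId(paq, onVisitorId) {
  paq.push([function () {
    // Client-side value only: per this thread, the ID shown in the
    // reports can differ from it.
    onVisitorId(this.getVisitorId());
  }]);
}

// Browser usage (assumed markup: <input type="hidden" id="matomo_visitor_id">):
//   var _paq = window._paq = window._paq || [];
//   captureVisitorId(_paq, function (id) {
//     document.getElementById('matomo_visitor_id').value = id;
//   });
```

Note that this only captures what the browser holds. It cannot show which visitor ID the server ends up recording for the visit.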

tsteur commented 4 years ago

The reason was to keep the same user ID across sites

Just asking to make sure I understand things right. When you say "user ID", do you mean Matomo's actual User ID feature, where you set a user ID using the setUserId JS tracking method, or do you mean the visitor ID?

Using the third-party cookie does accomplish this, but besides making the visitor ID remain the same across measurables, it caused the visitor ID reported by _paq.push([ function() { this.getVisitorId(); }]); (or the _pk_id. cookie) to be different from the ID that Matomo is showing in the reports. This is actually the largest concern, as we are using the visitor ID to tie offline conversions back to website activity (for example a phone call, or a Salesforce form submission to sale), and now the ID sent to the third-party platform no longer matches what is in Matomo, and even searching for the ID that was showing on the front end yields no results.

Not sure I understand things right. Are you saying you expect userId to be the same as visitorId? If so, there is some discussion going on around this in https://github.com/matomo-org/matomo/issues/15593

I'm asking because the quote below sounds like a different issue than the visitor ID being the same across measurables?

it caused the visitor ID reported by _paq.push([ function() { this.getVisitorId(); }]); (or the _pk_id. cookie) to be different from the ID that Matomo is showing in the reports.

If all sites are tracked into the same measurable, then it would be actually expected that they all have the same visitorId as it is not actually across measurables/sites. Looking into the code again, if there were different measurables used with third party cookies, then it looks to me like it would actually use different visitorIds for different measurables.

jjcitron commented 4 years ago

The reason was to keep the same user ID across sites

Just asking to make sure I understand things right. When you say "user ID", do you mean Matomo's actual User ID feature, where you set a user ID using the setUserId JS tracking method, or do you mean the visitor ID?

I meant the visitor ID.

Using the third-party cookie does accomplish this, but besides making the visitor ID remain the same across measurables, it caused the visitor ID reported by _paq.push([ function() { this.getVisitorId(); }]); (or the _pk_id. cookie) to be different from the ID that Matomo is showing in the reports. This is actually the largest concern, as we are using the visitor ID to tie offline conversions back to website activity (for example a phone call, or a Salesforce form submission to sale), and now the ID sent to the third-party platform no longer matches what is in Matomo, and even searching for the ID that was showing on the front end yields no results.

Not sure I understand things right. Are you saying you expect userId to be the same as visitorId? If so, there is some discussion going on around this in #15593

I'm asking because the quote below sounds like a different issue than the visitor ID being the same across measurables?

it caused the visitor ID reported by _paq.push([ function() { this.getVisitorId(); }]); (or the _pk_id. cookie) to be different from the ID that Matomo is showing in the reports.

I apologize for the confusion. When enabling third-party cookies, we saw the same visitor ID across multiple websites in different measurables.

If all sites are tracked into the same measurable, then it would be actually expected that they all have the same visitorId as it is not actually across measurables/sites. Looking into the code again, if there were different measurables used with third party cookies, then it looks to me like it would actually use different visitorIds for different measurables.

Unfortunately, this is not the case: we are seeing users reported with the same visitor ID across different measurables. The intended use was as you described above: tracking a user across multiple domains under the same measurable using third-party cookies (for the browsers that still support them, like Chrome). What happened was that we saw one visitor ID reported on the website, and we could not find it in Matomo. Then, when we searched using the GDPR tools via IP, we saw the same visitor ID reported against a single user (for example myself),

tsteur commented 4 years ago

What happened was that we saw one visitor ID reported on the website, and we could not find it in Matomo.

You mean you saw this reported e.g. in the developer logs, like by calling _paq.push([ function() { this.getVisitorId(); }]);? It can be expected that this is different from the ID in Matomo if at some point all cookies were deleted. The user could then be matched to a different visitorId server side if an older visitorId was found based on the fingerprint. The client-side visitor ID would be ignored in that case. Not sure if that's what you mean, just wanting to make sure we mean the same thing.
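To make the case described above concrete, here is a toy model of how a server could end up recording a different visitor ID than the one the browser sent. This is not Matomo's tracker code. The function name `resolveVisitorId`, the `recentVisits` list, and the field names are hypothetical, and the first branch (a known client ID is kept) is an assumption. Only the fingerprint fallback reflects what is described in this thread.

```javascript
// Toy model only, NOT Matomo's implementation. All names are hypothetical.
// request:      { clientVisitorId, fingerprint }
// recentVisits: [{ visitorId, fingerprint }, ...]
function resolveVisitorId(request, recentVisits) {
  // Assumption: a visit already recorded under the ID sent by the client
  // keeps that ID.
  const byId = recentVisits.find(function (v) {
    return v.visitorId === request.clientVisitorId;
  });
  if (byId) return byId.visitorId;

  // Case from the thread: cookies were deleted, so the client sends a fresh
  // ID, but the fingerprint matches an older visit. The older visitorId is
  // reused and the client-side ID is ignored.
  const byFingerprint = recentVisits.find(function (v) {
    return v.fingerprint === request.fingerprint;
  });
  if (byFingerprint) return byFingerprint.visitorId;

  // Assumption: with no match at all, the client-side ID is accepted.
  return request.clientVisitorId;
}
```

In this model, a browser that sends a fresh ID with a fingerprint matching an older visit is recorded under the older ID, which would explain why the ID seen on the front end cannot be found in the reports.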

Basically