Closed peterbo closed 4 years ago
This is intended. Since the referenced change the UserID no longer influences the visitor ID or visit ID.
Is this really what we want? In this case, the User-ID has no more added value than a named CustomDimension.
I think the primary use-case is the cross device recognition of users. I know many instances where the User-ID is used and it is always for this case. If you want to record a User-ID for a visit, decoupled from user recognition logic, one could simply use a custom dimension. Secondary use-case, but equally important, is to receive conversions / other actions server-side from external sites / own internal systems to measure campaign success / other KPIs for a given unique visitor/user.
A forced Visitor ID is nowhere near as flexible as the User-ID was. And speaking of semantics: User-ID, in my opinion, implies a single user who uses multiple devices or is tracked in different places. Therefore it must be the same visitor (in analytics terms).
Dimension definitions from my point of view:
The proposed use-case from https://github.com/matomo-org/matomo/pull/13620 ("This is useful for example when using the third party cookie, and thus all Matomo sites use the same "global" visitorId for the same device, and some Matomo sites set a userid.")
is not practical anymore, because 3rd party cookies are not GDPR compliant and blocked by most new browsers anyway. AFAIK, the visitor_id is queried in connection with the site-id. In my opinion, the other use cases are far more important than this one. I don't quite follow the point of changing this feature towards edge cases (3rd party cookies, cross site-id tracking, or a user who has 10 different accounts to log into online gambling) or towards a case that is no longer relevant.
ping @mattab
I'm not so much into this topic. I suppose it wouldn't help to do something like setVisitorId(hashedUserId.substr(0,16)) (pseudo code)?
I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit? But indeed tracking cross device is getting more complicated.
Thanks for the report @peterbo!
If you want to record a User-ID for a visit, decoupled from user recognition logic, one could simply use a custom dimension.
That's a good point :thinking: In general it was on purpose to generate separate visits on each device for the same user, but in retrospect I see what you mean: it has become more like a custom dimension and is maybe not as valuable.
I'm not so much into this topic. I suppose it wouldn't help to do something like setVisitorId(hashedUserId.substr(0,16)) (pseudo code)?
Yes, this could help you @peterbo if you run this code in JavaScript and then generate the same Visitor ID on the server side as well. It might be the easier solution in your case. Another solution would be to get the Visitor ID from your visitors, store it in your DB for each visitor/user, and then set it again when tracking the conversion server-side.
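For anyone trying the first suggestion, here is a minimal sketch, assuming a Node.js backend. It derives a 16 character hex visitor ID from the User ID and adds it to a server-side goal request. The helper names (visitorIdFromUserId, buildGoalUrl), the base URL and the goal values are made up for illustration. The cid parameter is my reading of the Tracking HTTP API and should be checked against the docs, and I have not verified that a truncated SHA-1 matches what core used to generate.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical helper: first 16 hex characters of sha1(userId).
function visitorIdFromUserId(userId) {
  return nodeCrypto
    .createHash("sha1")
    .update(String(userId), "utf8")
    .digest("hex")
    .slice(0, 16);
}

// Hypothetical helper: builds a server-side goal request that carries both
// the User ID (uid) and the derived visitor ID (cid, assumed parameter name).
function buildGoalUrl(baseUrl, { idSite, idGoal, userId, tokenAuth }) {
  const params = new URLSearchParams({
    idsite: String(idSite),
    rec: "1",
    idgoal: String(idGoal),
    uid: String(userId),
    cid: visitorIdFromUserId(userId),
    token_auth: tokenAuth,
  });
  return `${baseUrl}?${params.toString()}`;
}

// Placeholder values only.
const exampleUrl = buildGoalUrl("https://example.com/piwik.php", {
  idSite: 1,
  idGoal: 3,
  userId: "1234567890",
  tokenAuth: "XXX",
});
console.log(exampleUrl);
```

The same visitorIdFromUserId value would then have to be set in the browser too, otherwise the web visit and the server-side request still end up with different visitor IDs.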
I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit?
Yes, it was the idea also, and an advantage of changing the implementation...
What do you think?
In the "after" screenshot, the visit is not marked as "Returning" when the same "user id" visits twice. I expected the Visits log to show a Returning icon and the API to mark the visit as a returning visitor when it had a recent visit with the same User ID. Especially since on your "After" screenshot the 2nd visit was less than 30 min after the previous one, we would have expected it to show the returning visitor icon.
Also the FAQ needs updating as it explains the old algorithm with User id https://matomo.org/faq/general/faq_21418/ eg. If a User ID is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, this User ID will be converted (hashed) into a Visitor ID hexadecimal string. The hashed User ID becomes the Visitor ID. We look first for visits where the log_visit.idvisitor matches this Visitor ID (User ID). If no visit is matched, we look for visits where the log_visit.config_id matches the visitor fingerprint.
Hey @mattab thanks for the feedback!!
Yes, this could help you @peterbo if you run this code in JavaScript and then generate the same Visitor ID on the server side as well. It might be the easier solution in your case. Another solution would be to get the Visitor ID from your visitors, store it in your DB for each visitor/user, and then set it again when tracking the conversion server-side.
Making it work again is not the problem in principle. Unfortunately, the workaround is not that easy, because we can't execute business logic on the external endpoints. I could create a plugin that changes the recognition logic, so that's not really the problem either. The issue is rather that a key feature changed and can no longer be used natively for these modern and emerging use cases (e.g. cross device / cross API).
I suppose in general the idea was maybe also that you can see what a user did before logging in and after as part of the same visit?
That's a valid point. However, this is an adjacent use-case to all other uses, and, from my understanding, can be achieved easily with simple CustomDimensions. But also in this case, it doesn't make sense (at least to me) not to recognize the unique visitor again.
in the "after" screenshot, the visit is not marked as "Returning" when a same "user id" visits twice. Expected that the Visits log shows a Returning icon and API mark the visit as a returning visitor, when it had a recent visit with the same User ID. Especially that on your "After" screenshot the 2nd visit was less than 30min after the previous one, so we would have expected it to show the returning visitor icon.
That's what I'd have expected as well. Then the feature would also work for cross device. Decoupling from Visitor-ID is not a bad idea per se, but I feel that, at the moment, the feature is drifting between use cases and is not at all easy to understand for the average user. Perhaps it would be good to have a "default" behavior which can be configured towards a use case for advanced users?
@peterbo
Perhaps would be good to have a "default" behavior which can be configured towards a use case for advanced users?
Feel free to create a separate issue with your thoughts for this :+1:
in the "after" screenshot, the visit is not marked as "Returning" when a same "user id" visits twice. Expected that the Visits log shows a Returning icon and API mark the visit as a returning visitor, when it had a recent visit with the same User ID. Especially that on your "After" screenshot the 2nd visit was less than 30min after the previous one, so we would have expected it to show the returning visitor icon.
Also the FAQ needs updating as it explains the old algorithm with User id https://matomo.org/faq/general/faq_21418/ eg. If a User ID is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, this User ID will be converted (hashed) into a Visitor ID hexadecimal string. The hashed User ID becomes the Visitor ID. We look first for visits where the log_visit.idvisitor matches this Visitor ID (User ID). If no visit is matched, we look for visits where the log_visit.config_id matches the visitor fingerprint.
@mattab what is the benefit of the current userId behaviour over a custom dimension? If there's no clear benefit, I would 100% vote to change the behaviour back to the original behaviour, with no flags on how things work.
Expected that the Visits log shows a Returning icon and API mark the visit as a returning visitor, when it had a recent visit with the same User ID
I doubt that Matomo Core would be able to handle that (a returning visitor flag for a different visitor ID). E.g. Visitor-Log: A visit is flagged as returning and when you open the Visitor profile, you will only see one visit. This will probably also be the case for visitor based report archiving (being flagged as returning visitor but counted as two unique visitors) -> returning visitor reports will be distorted.
To really decouple visitor ID from user ID and really add value, some core modifications (archiving, visitor log, etc.) would probably be necessary. This would be part of a new ticket.
So at the moment, in my opinion, rolling back or creating a config setting for a default behaviour would be the best options. What do you guys think?
Why revert it? If you want the old behaviour just set the visitor id manually. ... and my first pull request actually had a setting to set the userid behaviour per-site, but you refused it. Honestly I really do not want to go back to applying patches for each Matomo update. I am thinking about forking and continuing the project under a new name. Currently userid is somewhat like a custom dimension, but one that automatically creates new visits. It's not something that can be replaced by simply using a CD.
because 3rd party cookies are not GDPR compliant and blocked by most new browsers anyway. AFAIK, the visitor_id is queried in connection with the site-id.
Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50). So 3rd party cookies are in fact working very well. (and I also have invested a lot of time to fix all the bugs in Matomo related to them)
For really decouple visitor ID from user ID and really adding value, probably some core modifications (archiving, visitor log, etc.) would be necessary. This would be part of a new ticket
Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.
For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain.
That's probably not a 3rd party cookie but a wildcard cookie that you set up with the scope *.example.org?
Currently userid is somewhat like a custom dimension, but one that automatically creates new visits. It's not something that can be replaced by simply using a CD
Generally, this would be easily possible by adding new_visit=1 once to a request that also includes a User-ID: '_paq.push(['appendToTrackingUrl', 'new_visit=1']);' - but I'd rather like to solve this for both use cases.
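To make that concrete, here is a minimal sketch of forcing one new visit at login, assuming the standard _paq queue. queueLogin is a hypothetical helper name. The last command clears the appended parameter again so that later requests do not carry new_visit=1 as well.

```javascript
// Hypothetical helper: queues the tracker commands for a login on the given
// _paq-style array and returns it.
function queueLogin(paq, userId) {
  paq.push(["setUserId", userId]);
  // Force a new visit for the next tracking request only.
  paq.push(["appendToTrackingUrl", "new_visit=1"]);
  paq.push(["trackPageView"]);
  // Clear the appended parameter so later requests are not affected.
  paq.push(["appendToTrackingUrl", ""]);
  return paq;
}

// In the browser this would be: queueLogin(window._paq, "1234567890");
const queued = queueLogin([], "1234567890");
console.log(queued);
```

This only covers sites that use the JS tracker; pixels and custom clients would need the new_visit=1 parameter added to the request themselves.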
Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.
That's why we're here. To discuss options and added value, not blindly execute. Hence, it would be great if you would contribute in the discussion of use cases and how we could create a feature that is good for different use cases and not break 50% with a minor update.
Thats probably not a 3rd party cookie but a wildcard cookie that you setup with the scope *.example.org?
Technically yes, but it works using Matomo's "3rd Party Cookie" feature.
Generally, this would be easily possible by adding new_visit=1 once to a request that also includes a User-ID: '_paq.push(['appendToTrackingUrl', 'new_visit=1']);' - but I'd rather like to solve this for both use cases.
If one is using the official Matomo JS API yes, but my 50+ Matomo sites are managed by many different teams, some using their own API, some using Pixels, etc, etc,. It would be a big pain and take a lot of migration time to move to this new way to do it.
That's why we're here. To discuss options and added value, not blindly execute. Hence, it would be great if you would contribute in the discussion of use cases and how we could create a feature that is good for different use cases and not break 50% with a minor update.
Yes, that is why I am here.
Basically having a per-site setting to switch between the two userid behaviors would be totally fine for me. Actually my first pull request had such a setting and even defaulted to the old behavior. I just do not want to be forced to do it the old way.
It would be a big pain and take a lot of migration time to move to this new way to do it.
Well, that's exactly the situation, I (and probably others) find myself in now. I also service a lot of instances with around 10k Sites. Just a few dozen of them are using the User-ID feature, but all of them rely on a recognition by User ID over visitor ID. So you could imagine the pain and work that has to be done.
Great, so go ahead and create that patch like I did instead of asking to have the work of others removed and break their setup because of your edge case.
Another comment on that statement: this is not an edge case but the reason the User-ID feature was introduced in the first place. So generally, it'd be good to keep it stable, especially within minor version updates. But that's something we all already agree on, so let's look ahead towards the resolution.
I'd be fine to make this a config setting - @tsteur @mattab what do you think about that?
We'll need to think more about it. Might take a while. @peterbo I'm not sure about a config setting. It would be better to find the optimal solution that fits most use cases. Maybe we can make (almost) everyone happy with a few tweaks to bring back the usefulness of User ID.
@mattab
I'm trying to understand the thoughts here. What is to your opinion now the difference between userId and a custom dimension? And why was it changed?
It seems to be 99% of users likely don't use 3rd party cookies and it was made worse for them but maybe I'm missing something.
@MichaelHeerklotz
Currently userid is somewhat like a custom dimension, but one that automatically creates new visits.
I'm actually not sure we're doing that currently, or are you saying it should? Really just trying to understand things here. I don't really understand yet why the current behaviour is better for 3rd party cookies and why it was previously not good. Can any of this behaviour maybe be achieved with a plugin?
Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50).
In that case, would you be able to use setCookieDomain and set the domain to .your-domain.com so the cookie is 1st party yet readable on all subdomains?
Then I see that Peter suggests the same and you reply "Technically yes, but it works using Matomo's "3rd Party Cookie" feature." which does not make sense to me? Why use 3rd party cookie if 1st party would work? Probably 3rd party is only needed when you want to do cross-domain analysis, i suppose...
@mattab what is the benefit of the current userId behaviour of a custom dimension? If there's no clear benefit, I would 100% vote to change behaviour back to original behaviour and no flags on how things work.
I guess the benefit is that, a visit on mobile will appear separately from a visit on desktop. Before the change, the interactions across mobile and desktop visits were merged into one. Whether it's a benefit is not clear however... as Peter points out (and a few other people by email) it's complex to update Mobile Apps and other SDKs to set the proper Visitor ID based on the web visit (or as a hash of User ID) etc.
Would reverting this be as simple as reverting this PR? https://github.com/matomo-org/matomo/commit/ea5a14bdf8aa9608cdc2ab7d5c8236a5ff1eb3e2
Could we maybe assign this to 3.13.4?
@mattab there were also few other follow up PRs and also in PHP SDK etc. Not too many I think.
I guess the benefit is that, a visit on mobile will appear separately from a visit on desktop. Before the change, the interactions across mobile and desktop visits were merged into one.
I seriously thought that merging visits across devices into one visit (cross device tracking) was the purpose of userId.
Re 3.13.4 depends. Would maybe need to go in a 3.13.5 if needed
I seriously thought that those merged across devices into one visit (cross device tracking) was the purpose of userId.
:+1:
Wrong. For example in my setup Matomo runs in its own subdomain matomo.domain.com and the matomo sites are other subdomains and paths on the same domain. In this setup the 3rd party feature works very well with all browsers and is very useful to connect the different matomo sites (over 50).
In that case, would you be able to use setCookieDomain and set the domain to .your-domain.com so the cookie is 1st party yet readable on all subdomains? Then I see that Peter suggests the same and you reply "Technically yes, but it works using Matomo's "3rd Party Cookie" feature." which does not make sense to me? Why use 3rd party cookie if 1st party would work? Probably 3rd party is only needed when you want to do cross-domain analysis, i suppose...
Different sites use different first party cookies, how would that help? How would that cause different sites to use the same visitor id?
Note: I have deleted some comments I made after this post, because I went too far with them. However, I really would prefer a professional handling of this issue. We could revert the changes and add a setting for it afterwards if you want to fix the issue asap.
Reverting a change that took months to get merged and telling me to use "setCookieDomain" which does not help at all is, let us say... a bit harsh.
In any case, we should keep the fix that avoids overwriting the global visitor id (_pk_uid) with the user id. If not, if any site messes up the setUserId() call (for example giving every logged out user the same id), it will break the whole Matomo setup for all sites.
A compromise could be to generate the visitor id from the user id, but to have multiple visits for each device. What do you think? @mattab @tsteur @peterbo
However, this still creates the problem that it basically breaks any per-device tracking. How could one see what was done before / after logging in or out?
I really feel we should have a setting for this.
Hi everyone, I'd like to heat up the discussion again for this topic, since a lot of instances can not be updated at the moment. @tsteur did you already make a decision how we should proceed with this issue?
I have not really any preference as I'm not so much in the topic. But wondering:
By default, I reckon more users would want the userId and visitor linked I suppose and use it for device tracking. Other SDKs would need to implement a similar behaviour. Not sure if that would work though. Alternatively, we could add a setting to the whole thing in the backend instead of in the tracking SDK (might make more sense)
I agree with @tsteur .
If we change it in the trackers/SDKs, this will be more work and maybe a bit confusing for the users, because the behavior will depend on the SDK version and not the Matomo version. In that case I would like to configure the setting for the JS/matomo.js SDK with the Matomo backend so that my (dev) users do not have to update their webpages/JavaScript implementations. I would be willing to create a patch for that feature.
On the other hand, if we change it in the Matomo Backend/Core, it would be less work and easier to understand for the user. In that case I would have less/no work for additional patches.
For me both solutions are okay, as long as I can set the default behavior globally in Matomo Core. So I am happy in any case :)
+1 for changing in core. Tracker behaviour should not be made more complex in my opinion.
I think that using the userID field without any added value over the Custom Dimensions is still a confusing approach, but restoring the original behavior and being able to activate the alternative functionality of the userId could be a good compromise.
Sounds good to have an option in the backend.
FYI, to make our FAQ accurate at: https://matomo.org/faq/general/faq_21418/, changed it from:
- If a [User ID][7] is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, this User ID will be converted (hashed) into a Visitor ID hexadecimal string. The hashed User ID becomes the Visitor ID. We look first for visits where the log_visit.idvisitor matches this Visitor ID (User ID). If no visit is matched, we look for visits where the log_visit.config_id matches the visitor fingerprint.
to:
- If a [User ID][7] is set, either via setUserId in your favorite SDK or via &uid= in the Tracking API, then we will look first for visits where the log_visit.idvisitor matches this Visitor ID. If no visit is matched, we look for visits where either the log_visit.user_id matches the User ID, or where log_visit.config_id matches the visitor fingerprint.
according to code in: https://github.com/matomo-org/matomo/blob/3.13.5/core/Tracker/Model.php#L397-L414
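To make the updated FAQ wording easier to follow, here is a rough sketch of the lookup order it describes. This only mirrors the FAQ text, it is not Matomo's actual code. findMatchingVisit and the in-memory rows standing in for log_visit are hypothetical.

```javascript
// Illustrative only: follows the lookup order described in the FAQ text.
// `visits` stands in for rows of log_visit.
function findMatchingVisit(visits, { idvisitor, userId, configId }) {
  // 1. Look first for a visit with the same visitor ID.
  const byVisitorId = visits.find((v) => v.idvisitor === idvisitor);
  if (byVisitorId) {
    return byVisitorId;
  }
  // 2. Otherwise match on the User ID or on the visitor fingerprint.
  const byUserOrConfig = visits.find(
    (v) => (userId != null && v.user_id === userId) || v.config_id === configId
  );
  return byUserOrConfig || null;
}

// Made-up sample rows.
const sampleVisits = [
  { idvisit: 1, idvisitor: "aaaaaaaaaaaaaaaa", user_id: "u1", config_id: "c1" },
  { idvisit: 2, idvisitor: "bbbbbbbbbbbbbbbb", user_id: null, config_id: "c2" },
];
console.log(
  findMatchingVisit(sampleVisits, {
    idvisitor: "cccccccccccccccc",
    userId: "u1",
    configId: "c9",
  })
);
```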
@mattab The documentation does not reflect the current behavior:
If no visit is matched, we look for visits where either the log_visit.user_id matches the User ID, or where log_visit.config_id matches the visitor fingerprint
That's not the case: A new visit is created and gets the same User-ID as the other visitor, exactly like a CustomDimension:
Both visits have the same User-ID and are no returning visitors (-> no visitorId or configId recognition). This use-case is an Android app with a webview which was held together with the forced User-ID before the changes.
Is this being worked on? If not, I'll create a PR.
I just want to add support for @peterbo's arguments here. I've had a long discussion with support from Matomo as well on this, and currently the UserID tracking simply isn't working like most people expect. We have a SPA where we set the UserID explicitly after logging in. When we ask for reports on time usage inside our app inside the Matomo web interface, where we select UserID as the first dimension, it simply does not work. The report is mostly identical with the same report using VisitorID (which we do not set currently at least). And worse, as far as I understand, getting reports grouped by UserID like I've described is not even possible with how it currently works. But maybe somebody here has better ideas, if so I'm all ears (I guess if we control the VisitorID client side we could do something, but that seems kind of counter intuitive, it feels like this is exactly what UserID should be used for).
@mattab are we changing this in 3.X too maybe? It seems like quite a broken feature now the user ID
@mattab The documentation does not reflect the current behavior: Both visits have the same User-ID and are no returning visitors (-> no visitorId or configId recognition). This use-case is an Android app with a webview which was held together with the forced User-ID before the changes.
I have the same problem. I have a brand new installation of Matomo. I switched Cookies off completely and want to measure returning visitors with device fingerprint and UserId only.
When I click on the tab "Visits/UserIds" I can see some users logging in several times, because the column "visits" shows "2" instead of "1". When I click "Visits/Overview", total visits and unique visitors are exactly the same, i.e. unique visitors are not detected.
For a new Matomo user, this looks like a bug, not a feature ;)
One more note: I think detecting unique visitors is maybe the most important thing to make all statistics meaningful, especially for measuring the success of campaigns and goals. This is why I thought: OK, I don't use cookies, but I put as much information as possible into the tracking code to help Matomo detect unique visitors. This is why I included the uid.
Question to @tsteur: you wrote an example to fix it temporarily :
setVisitorId(hashedUserId.substr(0,16))
Is there a best practice how to calculate a visitor ID including the userId?
Thx,
Andreas
@fcandi I haven't tested it but setVisitorId(sha1(userId).substr(0,16)) should basically work as this is what core used to do. In JavaScript you won't have sha1 available unless you add it AFAIK so I reckon any other kind of hashing will work just as well as long as you end up with a 16 character hex value. If you can use sha1() it basically only means that it will end up generating the same visitorId as it used to do a few months ago and therefore the same user would get the same visitorId as before.
If you can use sha1() it basically only means that it will end up generating the same visitorId as it used to do a few months ago and therefore the same user would get the same visitorId as before.
Thx @tsteur for your explanation. I control the backend, so it would be simple to generate the sha1 on the server while loading the user. But I have one more question, just to make sure I understand:
When I send UserId and VisitorId in the tracking code for logged in users, what happens to new user that are registering during their visit: Does Matomo still show the time before and after registration as one visit? Because this is important for measuring the conversion rate.
Does Matomo still show the time before and after registration as one visit?
Yes I would say so.
@fcandi I haven't tested it but setVisitorId(sha1(userId).substr(0,16)) should basically work as this is what core used to do. In JavaScript you won't have sha1 available unless you add it AFAIK so I reckon any other kind of hashing will work just as well as long as you end up with a 16 character hex value. If you can use sha1() it basically only means that it will end up generating the same visitorId as it used to do a few months ago and therefore the same user would get the same visitorId as before.
Any idea how the call to setVisitorId would look? I tried this:
window._paq.push(['setUserId', email]);
window._paq.push(['setVisitorId', sha1(email).substr(0, 16)]);
This did not work: it throws an exception about the setVisitorId method not being found inside the JavaScript client (I included the sha1 function with a library, so that's not the problem).
Sorry I did not realise this method is only available in development mode (in tests). I was actually planning on using this method myself so I created this PR: https://github.com/matomo-org/matomo/pull/16042
It will be available in the next release. If you need the file earlier you could patch your tracker file: https://raw.githubusercontent.com/matomo-org/matomo/exposesetvisitorid/matomo.js
Thanks for the clarification. We're currently on a paid hosting plan. But this bug and a couple of other issues (indicating we need to upgrade our plan to change even simple settings) are making me strongly consider self hosting instead.
Just BTW, on our Cloud you won't need to wait for the next Matomo release, which might be a month or two away. There you can expect this to be deployed by the end of next week at the latest.
Thanks, useful information, fingers crossed.
@mariusk just wanted to let you know that you can expect this change to become active on Monday.
Great, thanks, I'm assuming you mean the code to modify the visitorId should then work. I will try to reactivate it after Monday then, or as soon as I get notice that it should be live.
Yes the JS tracker that allows you to set the visitorId
Just fyi we deployed this yesterday @mariusk let me know if it's not working for you and I can follow up
@tsteur Thanks. I've re-deployed my updated user tracking code and this time around at least it works (doesn't crash the client). Fingers crossed!
@tsteur Another possible improvement related to this would be to avoid throwing an error when setVisitorId or similar doesn't exist. Today we're getting hit by people who still have the old Matomo client from the CDN while running our new release, which sets the visitorId directly as discussed. Since the "function call" is indirect (pushing to a list), I'm not sure wrapping the visitorId setting in a try/catch would solve that issue, but feel free to enlighten me if you think it should.
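One way to sidestep the missing method, sketched under the assumption that the tracker runs functions pushed onto _paq with the tracker instance as this (which is how I understand the JS tracker docs): check for the method before calling it. safeSetVisitorId is a hypothetical helper name, and this is untested against an old cached matomo.js.

```javascript
// Hypothetical helper: returns a _paq entry that only calls setVisitorId
// when the loaded tracker actually exposes that method.
function safeSetVisitorId(visitorId) {
  return [
    function () {
      // `this` is assumed to be the tracker instance when run from the queue.
      if (typeof this.setVisitorId === "function") {
        this.setVisitorId(visitorId);
      }
    },
  ];
}

// In the browser: window._paq.push(safeSetVisitorId("a9993e364706816a"));

// Simulate an old tracker (no setVisitorId) and a new one.
const oldTracker = {};
const newTracker = {
  visitorId: null,
  setVisitorId(id) {
    this.visitorId = id;
  },
};
safeSetVisitorId("a9993e364706816a")[0].apply(oldTracker); // does nothing
safeSetVisitorId("a9993e364706816a")[0].apply(newTracker);
console.log(newTracker.visitorId);
```

With an old client the visitor ID is then simply not forced, so those visits fall back to the normal recognition until the updated tracker file is loaded.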
@mariusk Hey Marius, it'd be great if you could clarify this topic via the forum / a support ticket, because it is no longer related to this ticket and quite a lot of people are getting notified of new messages here.
@peterbo Should be fine. Anybody following this ticket and attempting the same workaround will get smacked by the Matomo client throwing an error about the missing function, as reported (and confirmed) earlier. But only until everybody gets the updated Matomo client. I'll leave it for now and fingers crossed, it should all be good within a short time.
I'm setting a User-ID, when a user visits a given Site. On a certain action, I'm triggering a goal serverside with his User-ID as a parameter (and a token). Effect after the Update from 3.12 to 3.13.2 is that the serverside triggered action is not only stored in another visit (which would be ok), but also fails to recognize the web-visitor with the same User-ID. For reference, the before/after screenshots:
Before (Visitor recognized -> new visit but returning visitor):
After (Visitor is not recognized -> Visit is not returning -> new visit and new visitor):
Serverside call: https://example.com/piwik.php?token_auth=XXX&cdt=2019-08-07 18:56:10&idgoal=3&revenue=1234&idsite=X&rec=1&r=13454&uid=1234567890
In config, trust_visitor_cookie is disabled.
The reason for that is this change: https://github.com/matomo-org/matomo/commit/ea5a14bdf8aa9608cdc2ab7d5c8236a5ff1eb3e2#diff-6700aaf1ce500fe51e284b9ec6f01b01
The change works in the right direction, but now, a User-ID is only assigned to the same visitor, when also the config_id matches. This doesn't make sense, because the main use case is for example a user who logs into a website with different devices (GDPR aside, but the User-ID is for example the customer ID). This user should be recognized as the same visitor (not necessarily the same visit, but at least the same visitor). @MichaelHeerklotz
Refs https://github.com/matomo-org/matomo/pull/14360