matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.6k stars 2.61k forks

Understand why conversions are sometimes attributed to "Direct entry" or the payment provider, instead of the current channel #18612

Closed tsteur closed 2 years ago

tsteur commented 2 years ago

from https://forum.matomo.org/t/referral-exclusions/33582/25

The resolution suggested by Matomo in this article https://matomo.org/faq/how-to/how-do-i-add-a-referral-exclusion-in-matomo/ does not really address the underlying issue - it merely recategorizes the referrer from “website” to “direct entry”, which in reality isn’t the true referrer for the user’s visit.

Here is an example of what is happening.

  1. User clicks on an email link to visit the website.
  2. User browses website and adds items to basket (at this stage referrer is “Email campaign”)
  3. User completes the order and the website redirects to a 3rd party payment provider (PayPal/Amazon etc.)
  4. User returns to the website, but now the referrer for the visit is reset from “Email campaign” to “website”, or, if you have set the referrer exclusions, to “direct entry”. What we need is for it to remain “Email campaign”.

As a result we are over attributing the channel “direct entry” greatly (40% vs 15% on GA).

I'm not sure if we should actually call this feature "Referrer exclusion", as it's more like "ignore referrer" or "keep the original referrer". We wouldn't exclude this tracking request. We would still track that referrer but re-use the original referrer from the same visit. If it's from a different visit, would we then use a direct visit again?

sgiehl commented 2 years ago

After having a closer look at the code and building a test matrix with over 400 potential combinations, I have written tests to cover all cases I could think of. This includes the configuration options for creating new visits when the website or campaign changes, as well as having a referrer's url in the site urls and using attribution cookies that might be placed and updated when the referrer changes. As looking at the tests won't make much sense for non-techies, I have tried to build some flow charts that visualize how attribution in Matomo works, based on the test results.


Attribution of visits and conversions

The following chart describes how attribution in Matomo itself takes place: (chart: Referrer Attribution)


Handling of Attribution Cookie

In the part above there is a check if a valid attribution cookie is being sent. This is handled by the javascript tracker. The chart below describes when a cookie will be set and/or updated: (chart: Referrer Attribution Cookie)


I hope this helps everyone to understand how the attribution process works. If anything is unclear or there are any related questions, let me know.

Regarding the original issue, that conversions might be attributed to the wrong channel when returning from a payment provider: if the payment provider is added to the site urls and also added in javascript using setDomains, the payment provider should not occur anywhere in the referrers, and the visits and conversions should be attributed correctly. I guess the setDomains part is missing in our FAQ and might need to be added.
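A minimal sketch of the javascript half of that workaround, assuming a shop at shop.example.com and PayPal as the payment provider (both host names are placeholders, not taken from this thread). The provider still has to be added to the website's urls in the Matomo settings as well. The fallback to a plain array is only there so the snippet also runs outside a browser.

```javascript
// Command queue used by the Matomo JavaScript tracker; reuse it if it already exists.
var _paq = globalThis._paq = globalThis._paq || [];

// Placeholder hosts: the shop itself plus the payment provider the visitor returns from.
var ownAndTrustedDomains = ['*.shop.example.com', '*.paypal.com'];

// Queue setDomains before trackPageView so it applies to that page view.
_paq.push(['setDomains', ownAndTrustedDomains]);
_paq.push(['trackPageView']);
```

Whether this is enough for a given setup depends on the other attribution settings discussed in this thread, so it is best verified against the visitor log after deploying.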


Additionally for us cloud users we need to be able to access this from the API, so @sgiehl perhaps you are correct that the code is working as expected (and it is stored somewhere in the database), but could you confirm how I can access this "initial referrer for each visit" from the API so that I can use this information in our reporting?

Currently the visit details returned by the API only contain the referrer attributed to the visit. The conversions might, though, be attributed to another referrer. Tomorrow I will start setting up a PR to add the referrer to the conversion details returned in a visit, so that information will become visible. Maybe we could even show that in the visitor log (at least if the referrer differs from the visit).

sgiehl commented 2 years ago

While writing more tests for cases where attribution cookies are used, I might have found a small issue that might affect the visit attribution. The flow chart above was also incorrect for that case, as it didn't take the cookie into account for visit attribution, which is actually done, but only if the cookie attribution is for a campaign. I've updated the chart. I hope it now reflects the current behaviour.

sgiehl commented 2 years ago

Does anyone have some feedback on the flow charts above? Those charts should document the current attribution behavior. If there is anything unexpected, let us know, so we can discuss possible changes.

ts1985 commented 2 years ago

I have described the problem here several times, as have others here and in other forums. As a result, we now have a flow chart and the statement that everything works as expected. Accordingly, it is not acknowledged that there is a bug, or at least a design error, here. The result is that Matomo is "useless" in this sense and people then have to switch to Google Analytics or other software after all.

The flow chart is very technical. That you can, or must, set or maintain something in various places is not a solution from my point of view, apart from the fact that there is no interface for this in the admin. Whoever uses Matomo expects that it works. But it does not.

Again:

If this state is still ok from your point of view, then this is either a serious design error or a bug.

The fact that you generally have a problem assigning visitors correctly can be seen in the problem mentioned here (and in other places) that an extremely high number of visitors are assigned to "direct entry". If you compare this with alternatives like Google Analytics, significantly more visitors are assigned to correct sources.

So there are several serious problems here that may be related. This should be recognized and fixed. So I don't need a flow chart to understand that everything works. That it does not work correctly has been described several times in various places by different people.

Sorry if that may come across as a bit rude. But when you recognize a problem, then see that others report the same and as a result you are told that everything works as expected, that is more than frustrating.

mattab commented 2 years ago

Thanks @sgiehl for the flow chart, that's very helpful indeed.

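2 notes on this:

  • Could we put it on the developer guides after this issue is closed?
  • In the first chart, on the bottom right, it says "Conversion is attributed to Referer Y" -> But I would have expected instead to read "Conversion is attributed to the current Visit Attribution" (which may be either X or Y depending on what the flow was). Can you confirm this is the case and if so, update the chart above?

Now I think the problem we're all concerned about is this flow: (image)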

This behavior is maybe the one causing all the trouble here.

I see maybe 2 possible solutions

Either solution 1

We remove this behavior and instead the "Visit attribution" stays untouched (either Direct or Referer X). This is maybe what @ts1985 suggests which makes sense.

OR solution 2

it has 2 parts:

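a) We implement a new behavior (in Chart 1) such as this: "Visit attribution updated to Referer Y" only if:

  • it's not a known payment provider domain (paypal, etc. <- we should try to find a list of such URLs to include by default so we cover most cases)
  • OR the URL of the website contains the &ignore_refer[r]er=true|1 url parameter (in the request where the Referer Y is suddenly set)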

b) Additionally we implement new behavior in Chart 2 ("handling of attribution cookie"), so that any visitor who enters from "a known payment provider domain" doesn't trigger the whole flow "Visitor comes to the website from referer X", because the payment provider would not be counted as a "referer" and would be ignored. The same would happen if the url contains ignore_refer[r]er. (So maybe this logic needs to be implemented in both the Tracking API and the JS code handling the attribution cookie.)

==

Analysis on solutions

If we go with solution 1) It will actually regress for some users. Indeed, some people want that the Referer Y which suddenly appears in a visit, should be set in the visit and overwrite the direct entry or existing website referer. (I actually don't know how big this use case is, but certainly it improves data quality for some people - Not people in this conversation though)

If we go with solution 2), in general it feels like it would fix the issue completely (maybe?!) and also not regress behavior for anyone (where ignoring a new referer might be undesired for some people). So maybe solution 2) only improves behavior & data accuracy while also fixing the issue we have here? The only downside of solution 2) is having to maintain a list of payment provider URLs etc. But maybe someone else already created this list? And maybe we could manage it in a new open source repository to encourage contributions?

adsham commented 2 years ago

Hi @sgiehl

Thanks for your thorough investigation so far, it looks like we are starting to get somewhere. It shows the complexity in the process.

I do agree with @mattab with regards to the fact there may be genuine use cases where we might want to maintain the latter referrer (even though I would say the majority would be interested in the referrer that initiated the visit). Rather than maintaining a list of exclusions which lead to issues (would the list be a global one? Would each site need an exclusion list? Would different websites have different requirements for the exclusion list?) - Could we instead have a config option that just always respects Referrer X (and overrides any update to Y)?

Also please do pardon my ignorance, I confess I do not fully appreciate the inner workings of the process despite the detailed data flow above. I have highlighted the paths that cause an update to referrer Y (from X). Could you clarify what "Prev. visit" is defined as? Is it the first step in the data flow (before being redirected back to the site, or is it a visit from the past?) Looking at it with my simple eyes and my current experiences, I am not concerned whether or not the previous step/visit was direct or campaign or website or something else (red highlighted paths). From a business point of view I want to know what got the customer to the website (was it my email or ppc campaign or was it a 3rd party referral, was it a search engine referral). If another campaign supersedes this during the lifetime of this visit - from my personal viewpoint I am not concerned, I want to maintain the initial visit referrer.

image

As a side note, I am not familiar with some of the flags such as "create_new_visit_when_website_referrer_changes" and "create_new_visit_when_campaign_changes" - Could you confirm the defaults of these for hosted Matomo users.

Many thanks

adsham commented 2 years ago

I would also agree that the data flow should go into the documentation - many users may not understand the complexity behind the attribution process.

sgiehl commented 2 years ago

@ts1985 Simply trying to implement some new stuff that solves a specific issue won't help anyone, unless we are able to guarantee that nothing else would break with it. And that is the reason why I started writing tests and creating the flow charts. To me it was important to document the current behavior. That does not mean that this behavior is necessarily correct, or that it does not need to be adjusted or could not be improved. But before implementing new stuff we need to ensure that this behavior is either correct, or that we have adjusted/fixed it so it does what it should do. Once that is clarified I'm happy to discuss how we can handle the issue regarding referring payment providers. As mentioned before, currently this can be achieved by adding the payment provider to the site urls as well as using setDomains in javascript. This is surely not the best solution, but at least a workaround until we have something more useful.


@mattab

Could we put it on the developer guides after this issue is closed?

Sure. But unless we have a common tool to create/edit such charts, no one else might be able to update the charts later. So might be good to clarify that internally first.

In the first chart, on the bottom right, it says "Conversion is attributed to Referer Y" -> But I would have expected instead to read "Conversion is attributed to the current Visit Attribution" (which may be either X or Y depending on what the flow was). Can you confirm this is the case and if so, update the chart above?

The right side actually is for visits with referrer Y, the left side is with referrer X (the lines from the step above are continued). Guess we could unify the last part to either using cookie referrer or visit referrer and don't differ between referrer X and Y from before.

Now I think the problem we're all concerned about is this flow:

I'm not sure if that part is the problematic one. It actually only says that the referrer would be updated from DIRECT to WEBSITE in that case, which seems correct.

Regarding the payment provider referrer issue I actually see this possibility:

Additionally we could also use a new url parameter that triggers the javascript tracker to ignore the referrer. So as soon as the url parameter &ignore_refer[r]er=true|1 is provided, the javascript tracker will neither send the referrer with the tracking request, nor update the attribution cookie to the current referrer.

I'm not sure if starting to automatically hide known payment providers would be much of a help for everyone.


@adsham

Thanks for your notes. With prev. visit referrer I actually meant the referrer previously set for the current visit, so in the flow chart this would be the previous step. Would it be clearer to write Referrer X instead? The defaults for the configuration options are: create_new_visit_when_website_referrer_changes = 0 and create_new_visit_when_campaign_changes = 1


The part in the flow charts I personally would not have expected is that the attribution cookie might set a visit referrer if it is a campaign. This also means that if I visit a page from a campaign and an attribution cookie is placed, all direct visits that happen afterwards would be attributed to the campaign. My expectation would be direct entries, where the conversions might still be attributed to the campaign. But maybe that is something that is expected from a marketing/business point of view?


Anyway, I guess it would be good to maybe gather a list of changes everyone might require and afterwards create issues for each requirement/feature request separately, so the product team will be able to prioritize them.

@justinvelluppillai @mattab could you maybe take over from here and discuss and decide what we want/need to change or implement. I'm happy to do that once that was clarified.

ts1985 commented 2 years ago

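A list of payment providers to ignore is not a solution. Reasons:

  • it's a bug that the referrer changes. Ignoring a list of payment providers also means that if they mention a website in their blog or link to it on their website, that referral is not visible anymore. And of course I also want to track the visitors coming from a payment provider. But if someone comes to our website and just uses a payment provider, I want to know where this visitor came from and not that he used payment provider x.
  • also, I'm talking here about payment providers, but as written before, the same issue could occur with other third party services like login via Google/Facebook/whatever and maybe some other stuff.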

So a list of such providers is not the solution. The problem is the bug that the referrer of visitors is changed if they leave the website to do some third party stuff and then come back when finished.

Demichev commented 2 years ago

@mattab

I see maybe 2 possible solutions

Either solution 1

We remove this behavior and instead the "Visit attribution" stays untouched (either Direct or Referer X). This is maybe what @ts1985 suggests which makes sense.

OR solution 2

it has 2 parts: a) We implement a new behavior (in Chart 1) such as this: "Visit attribution updated to Referer Y" only if:

  • it's not a known payment provider domain (paypal, etc. <- we should try to find a list of such URLs to include by default so we cover most cases)
  • OR the URL of the website contains the &ignore_refer[r]er=true|1 url parameter (in the request where the Referer Y is suddenly set)

Can we have a mix of two solutions?

  1. You don't need to fully remove this behavior, just add new option to config as @heurteph-ei said earlier. So there is no regression here.
  2. I found that some users keep a page from our site open, with the payment site as referrer, after a purchase. It then starts a new visit with the payment site as referrer weeks and weeks after the actual purchase. Of course we can add some JS to refresh the page and reset the referrer, but it would be good to have a list of ignored sites. And I don't like url parameters. They cannot be used in some cases.

mattab commented 2 years ago

Maybe we could do what @Demichev you suggested:

  1. introduce new INI setting update_referrer_in_visit_when_referrer_changes set to 1 by default (that people can set to 0 to ignore all referrer updates)
  2. also introduce list of payment providers including paypal.com etc to always ignore (new INI setting + setting in JS tracker) so they are always ignored in the cookie and in the visit's referrer
  3. also introduce similar ignore logic when the &ignore_refer[r]er url parameter is found

@sgiehl @justinvelluppillai could we do these 3 maybe?

sgiehl commented 2 years ago

  1. introduce new INI setting update_referrer_in_visit_when_referrer_changes set to 1 by default (that people can set to 0 to ignore all referrer updates)

This one could actually make problems and I'm not sure which problems it should solve. The referrer is currently only updated when it was previously set to DIRECT. The reason for that is, that we assume it might not have been set correctly with the first hit. In some cases it might maybe happen that other tracking request(s) might be sent or processed before the initial page view is processed. If the first request does not contain the referrer, it wouldn't be updated with the pageview afterwards, which then might cause even more unexpected DIRECT hits.
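The rule described in this paragraph can be sketched as follows. This is a simplified illustration only: the function name and the referrer objects are made up for the example, it is not the code Matomo runs, and it leaves out the campaign handling and the create_new_visit_when_* options shown in the flow charts.

```javascript
// Hypothetical referrer type label for this sketch; not a Matomo constant.
const DIRECT = 'direct';

// Returns the referrer a visit keeps after another tracking request arrives.
function resolveVisitReferrer(stored, incoming) {
  // Only a visit stored as DIRECT is updated, and only to a non-direct referrer.
  if (stored.type === DIRECT && incoming.type !== DIRECT) {
    return incoming;
  }
  // Any other stored referrer is left untouched.
  return stored;
}

const paypal = { type: 'website', name: 'paypal.com' };

// A visit that started from an email campaign keeps the campaign.
const fromCampaign = resolveVisitReferrer({ type: 'campaign', name: 'newsletter' }, paypal);

// A visit first stored as direct is updated to the payment provider.
const fromDirect = resolveVisitReferrer({ type: DIRECT, name: '' }, paypal);
```

Under this rule, dropping the update step would leave the second visit as DIRECT, which is the extra "direct entry" traffic mentioned above.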

From my technical point of view I would actually expect to have create_new_visit_when_website_referrer_changes = 1 by default. Having it disabled, which we currently have, actually hides all website referrers that might come in between. I don't know how other analytics solutions are handling that, but I would actually expect that every time the referrer changes (to something other than DIRECT), a new visit should be started. Technically someone needs to leave the page in order to come back with a new referrer, so why don't we count that as a new visit? Maybe we should think about cleaning that up with Matomo 5 and only have one config option like create_new_visit_when_referrer_changes = 1

  2. also introduce list of payment providers including paypal.com etc to always ignore (new INI setting + setting in JS tracker) so they are always ignored in the cookie and in the visit's referrer

I already started working on that here: https://github.com/matomo-org/matomo/tree/referrerexclusion and can continue to work on it.

  3. also introduce similar ignore logic when the &ignore_refer[r]er url parameter is found

So the parameter will trigger the javascript not to store the referrer in the cookie and maybe even prevent sending the referrer with the tracking request. On PHP side we can also discard the referrer in that case and drop the parameter from the tracked url. But should that also affect tracking campaigns? What should happen if the tracked url contains some campaign parameters but also ignore_refer[r]er? Should we ignore the campaign in that case as well?

@justinvelluppillai once clarified: shall we create new issues for each of the points, so we can plan them into the next milestones separately?

justinvelluppillai commented 2 years ago

@sgiehl that sounds good to create separate issues. Perhaps 2. can make it into this milestone, do you think?

sgiehl commented 2 years ago

@justinvelluppillai I will go on working on it later, but not sure if it will pass the review process till the RC release.

mattab commented 2 years ago

I would actually expect that every time the referrer changes (to something other than DIRECT), a new visit should be started. Technically someone needs to leave the page in order to come back with a new referrer, so why don't we count that as a new visit?

The problem with creating new visits is that an edge case can end up creating many new visits, and that has a lot of impact on data accuracy and trust. Here is an example of such an edge case: if someone has tabs open, and one of the tabs has the old referrer, then whenever their browser reloads and the tabs reload, it could create 2 visits each time. It quickly becomes a bug that many people can spot, and I think there are more such edge cases.

should that also affect tracking campaigns? What should happen if the tracked url contains some campaign parameters but also ignore_refer[r]er? Should we ignore the campaign in that case as well?

I'd say either is fine. I don't have strong preference. (But if i had to choose, then maybe it's simpler to also ignore tracking campaign parameters when ignore_referer is set, so as to make sure it won't create a new visit. )

sgiehl commented 2 years ago

@mattab I have now created pull requests to cover some parts of your suggestions.

https://github.com/matomo-org/matomo/pull/19302 will introduce the possibility to set a list of referrers that should be ignored. This needs to be done in the site settings (and/or globally), which causes the PHP tracker to ignore such referrers. In addition there will be a new method for the javascript tracker to ignore certain referrers. This is needed to prevent the attribution cookie from being updated, which might otherwise cause a change in conversion attribution.
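To illustrate the idea of an ignore list, here is a rough sketch of how a referrer could be checked against such a list. The function name, the entry format ("host" or "host/path") and the matching rules are assumptions made for this example; they are not taken from the pull request and may differ from what Matomo actually does.

```javascript
// Hypothetical helper: true if the referrer url matches one of the ignored entries.
// Each entry is assumed to be either "host" or "host/path-prefix".
function isExcludedReferrer(referrerUrl, excluded) {
  let ref;
  try {
    ref = new URL(referrerUrl);
  } catch (e) {
    // Empty or malformed referrers are never treated as excluded.
    return false;
  }
  const host = ref.hostname.toLowerCase();
  return excluded.some(function (entry) {
    const slash = entry.indexOf('/');
    const entryHost = (slash === -1 ? entry : entry.slice(0, slash)).toLowerCase();
    const entryPath = slash === -1 ? '' : entry.slice(slash);
    // Match the host itself or any of its subdomains.
    const hostMatches = host === entryHost || host.endsWith('.' + entryHost);
    // An entry without a path matches every path on that host.
    return hostMatches && ref.pathname.startsWith(entryPath);
  });
}

// Ignore only the checkout pages of the provider, keep its other pages as referrers.
const ignored = ['paypal.com/checkout', 'deref-gmx.net'];
const checkoutIgnored = isExcludedReferrer('https://www.paypal.com/checkout/done', ignored);
const blogIgnored = isExcludedReferrer('https://www.paypal.com/blog/post', ignored);
```

With the path-based entry, a return from the checkout is ignored while a link from another page of the same provider would still count as a referrer.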

https://github.com/matomo-org/matomo/pull/19420 will introduce the url parameter ignore_refer[r]er, which lets the javascript and php tracker ignore the current referrer. The url parameter is automatically removed. As discussed, this also includes campaign detection.
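A small sketch of what detecting and removing such a parameter could look like. Both helper names are hypothetical, and the accepted values (1 and true) follow the &ignore_refer[r]er=true|1 notation used earlier in this thread; the real tracker code may handle this differently.

```javascript
// Both spellings mentioned in the thread.
const IGNORE_PARAMS = ['ignore_referrer', 'ignore_referer'];

// Hypothetical helper: true if the page url asks for the referrer to be ignored.
function hasIgnoreReferrerFlag(pageUrl) {
  const params = new URL(pageUrl).searchParams;
  return IGNORE_PARAMS.some(function (name) {
    const value = params.get(name);
    return value === '1' || value === 'true';
  });
}

// Hypothetical helper: returns the page url without the ignore parameter.
function stripIgnoreReferrerFlag(pageUrl) {
  const url = new URL(pageUrl);
  IGNORE_PARAMS.forEach(function (name) {
    url.searchParams.delete(name);
  });
  return url.toString();
}

// Return url a payment provider could be configured to redirect to (placeholder host).
const returnUrl = 'https://shop.example.com/thanks?order=7&ignore_referrer=1';
const shouldIgnore = hasIgnoreReferrerFlag(returnUrl);
const trackedUrl = stripIgnoreReferrerFlag(returnUrl);
```

This only helps where the return url can be configured, which, as noted elsewhere in this thread, is not always possible.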

Imho those new features should give everyone the possibility to ignore their payment provider (or other services) as referrer. I'm actually not sure if it's worth to implement a static list of known providers. Maybe adding/updating some FAQs might be good enough.

ts1985 commented 2 years ago

A list of payment providers to ignore is not a solution. Reasons:

  • it's a bug that the referrer changes. Ignoring a list of payment providers also means that if they mention a website in their blog or link to it on their website, that referral is not visible anymore. And of course I also want to track the visitors coming from a payment provider. But if someone comes to our website and just uses a payment provider, I want to know where this visitor came from and not that he used payment provider x.
  • also, I'm talking here about payment providers, but as written before, the same issue could occur with other third party services like login via Google/Facebook/whatever and maybe some other stuff.

So a list of such providers is not the solution. The problem is the bug that the referrer of visitors is changed if they leave the website to do some third party stuff and then come back when finished.

Please read this again and understand, that your solution is not a solution. This will not fix the bug. It's just a workaround which will result in more issues without fixing the main issue.

sgiehl commented 2 years ago

@ts1985 That topic is not as easy as you might think. Updating the referrer is only done in some specific cases and only if the previously stored referrer was direct. So that can't overwrite any other referrers. But that actually isn't the problem the issue topic was about. The conversion attribution is not only defined by the visit referrer. If there is an attribution cookie, that one will overwrite the conversion attribution. As this one is handled in javascript, we can't really do anything on the server side.

If you don't want a conversion being attributed to a certain service provider, you would need to unset the referrer for the javascript tracking or use setConversionAttributionFirstReferrer. Or with the new feature you can define domains/hosts that should be ignored for that. I'm aware that there might be cases where someone still wants to track the referrer when the user is initially coming from that service provider. But at least on javascript side we can't know that, as we don't have any details on the current visit there. As the new feature allows to define only subdomains or domains including a path, that should be enough to only exclude certain urls of the service provider, while still tracking others.
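A minimal sketch of the setConversionAttributionFirstReferrer option mentioned above. With it enabled, the tracker attributes conversions to the first referrer stored in the attribution cookie instead of the last one. The fallback to a plain array is an assumption so the snippet also runs outside a browser.

```javascript
// Command queue used by the Matomo JavaScript tracker; reuse it if it already exists.
var _paq = globalThis._paq = globalThis._paq || [];

// Attribute conversions to the first referrer instead of the most recent one.
_paq.push(['setConversionAttributionFirstReferrer', true]);

// Tracking calls follow the configuration calls.
_paq.push(['trackPageView']);
```

Note that this changes attribution for all referrers, not only for payment providers, so it is a different trade-off than ignoring specific hosts.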

ts1985 commented 2 years ago

I think it is that easy. Just don't update the referrer. It's a bug that the referrer is updated for the same visit.

And another big problem, also described here by different people, is, that there are so many "direct" visits. Some compared to Google Analytics for example and that there are much less "direct" visits. So it seems that Matomo often doesn't recognize the Referrer of a visit.

sgiehl commented 2 years ago

Well, not updating the referrer anymore will bring you even more direct visits 🤷 Anyway, I can't compare where GA is getting its referrer data from. We are tracking a referrer as soon as one is provided by the browser. This might e.g. not be the case if the referring website restricts the referrer from being sent.

ts1985 commented 2 years ago

There is a complete thread about this "too many direct entries" topic: https://forum.matomo.org/t/what-can-explain-the-mysterious-too-many-direct-entries-phenomenon/31721

It's a real problem. I know it's another problem but it seems the Referrer topic in general is problematic for Matomo.

schuetzm commented 2 years ago

We're experiencing the same problems that conversions aren't attributed correctly to the campaigns, but to visits from Paypal, or to "deref-gmx.net". The latter is an example that hasn't been mentioned yet; it's from an email campaign that includes campaign parameters in the email, but the visit (and subsequent ecommerce purchase) is still attributed to a website, with no trace of the campaign to be found in the visitor's profile. (I've disabled cookies, don't know if that has anything to do with it.) What we're seeing is that according to Matomo's statistics, none of our campaigns in the recent months have led to any conversions, even though this is very unlikely in general, we see a spike in purchases after campaigns, and at least for some conversions we have evidence that they definitely came from a campaign.

@sgiehl From what you described, you intend to add a way to ignore certain referers, but that requires everyone to find out which URLs they should add, and also will exclude some legitimate referers. The example with deref-gmx.net shows how fragile and open ended that is, as you probably wouldn't have thought of this domain. (The other idea about the URL parameter wouldn't work at all, as we cannot influence it, obviously.) Could you describe how this would get us correct attributions both in the absence of a full knowledge of the referrers we have to exclude, and without requiring every single user to adjust their configuration?

From what I understand, the underlying problem here really is that the source of a visit is ever changed after the fact at all. This seems fundamentally wrong: The source of a visit is fully determined by the referrer (or parameter) of the first visited page. If, as you described above, the events are sometimes not submitted in the right order, then that can be worked around e.g. by only assigning a source after a few seconds. But importantly, once a source has been decided on, it should never be changed anymore, because it is by then a historic fact that shouldn't be affected by any later events.

It also seems wrong that direct visits are treated differently. A direct (= unknown) visit is a source like any other and shouldn't be changed after the fact either. In particular, if we don't know where a user came from at first, the fact that at a later time they came from an identifiable source still doesn't give us any knowledge about their first visit. Instead, there probably needs to be a distinct state for the source, namely "uninitialized", which would be different from a "direct visit"; this state would only exist as long as we don't have enough information to make a decision yet.

Now, this doesn't take the more complicated cases into consideration, e.g. the campaign id changing in the middle of a visit. IMO these should probably be counted as a new visit, as such a change is evidence that the user left and then returned through a different channel. But whatever is decided here, the complicated cases should not affect the behaviour of the simple cases.

tl;dr: May I suggest to implement a mode where it only determines the source from the first page visit (or the first few visits in the first few seconds, if technically necessary), never changes it afterwards, and treats direct visits like any other? Then we can experiment with that, and if it works out, make it the default mode, leaving the previous mode for backwards compatibility or even removing it. I'm pretty sure this is what many of the participants (and lurkers) in this discussion want, even if it later might have to be adjusted to handle some edge cases.
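The proposed mode could be sketched roughly like this. Everything here is hypothetical and only illustrates the suggestion: the names, the shape of the request objects, the grace period, and the choice to fall back to "direct" when no page view arrives in time are assumptions, not existing Matomo behaviour.

```javascript
// Distinct state for "no decision yet", as opposed to a decided direct visit.
const UNINITIALIZED = 'uninitialized';

function createVisit(startedAt, graceMs) {
  return { startedAt: startedAt, graceMs: graceMs, source: UNINITIALIZED };
}

// Applies one tracking request to the visit and returns the updated visit.
function applyRequest(visit, request) {
  // Once decided, the source is a historic fact and never changes, direct included.
  if (visit.source !== UNINITIALIZED) {
    return visit;
  }
  // The first page view decides the source.
  if (request.isPageView) {
    return Object.assign({}, visit, { source: request.source });
  }
  // Assumption: without a page view inside the grace period, settle on direct.
  if (request.time - visit.startedAt > visit.graceMs) {
    return Object.assign({}, visit, { source: 'direct' });
  }
  return visit;
}

// An event is processed before the page view, then the visitor returns from PayPal.
let visit = createVisit(0, 5000);
visit = applyRequest(visit, { isPageView: false, source: 'direct', time: 100 });
visit = applyRequest(visit, { isPageView: true, source: 'campaign:newsletter', time: 400 });
visit = applyRequest(visit, { isPageView: true, source: 'website:paypal.com', time: 900000 });
```

In this run the visit stays attributed to the campaign, because the later page view with the payment provider as referrer arrives after the source was decided.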

sgiehl commented 2 years ago

Not updating the referrer at all can be easily achieved by removing a couple of methods:

https://github.com/matomo-org/matomo/blob/115527353a9e75e01aa4d263408956ae45403bea/plugins/Referrers/Columns/ReferrerType.php#L58-L68

https://github.com/matomo-org/matomo/blob/115527353a9e75e01aa4d263408956ae45403bea/plugins/Referrers/Columns/ReferrerName.php#L40-L50

https://github.com/matomo-org/matomo/blob/c973567705a0065fdd7d7c7b11b80f1f0f1be350/plugins/Referrers/Columns/ReferrerUrl.php#L53-L63

Removing those would fully stop updating any referrer information after the first tracking request of a visit.

Only setting/updating it for the first page view or within a certain time frame sounds fine to me, but that's not a decision I can make. ping @mattab

But: Conversions might still be attributed to another referrer. Those attributions are handled using an attribution cookie, which might be updated when returning from any external service (as by default it uses the last referrer).

adsham commented 2 years ago

Not updating the referrer at all can be easily achieved by removing a couple of methods:

https://github.com/matomo-org/matomo/blob/115527353a9e75e01aa4d263408956ae45403bea/plugins/Referrers/Columns/ReferrerType.php#L58-L68

https://github.com/matomo-org/matomo/blob/115527353a9e75e01aa4d263408956ae45403bea/plugins/Referrers/Columns/ReferrerName.php#L40-L50

https://github.com/matomo-org/matomo/blob/c973567705a0065fdd7d7c7b11b80f1f0f1be350/plugins/Referrers/Columns/ReferrerUrl.php#L53-L63

Removing those would fully stop updating any referrer information after the first tracking request of a visit.

Only setting/updating it for the first page view or within a certain time frame sounds fine to me, but that's not a decision I can make. ping @mattab

But: Conversions might still be attributed to another referrer. Those attributions are handled using an attribution cookie, which might be updated when returning from any external service (as by default it uses the last referrer).

Can these be removed in hosted instances of Matomo?

sgiehl commented 2 years ago

@adsham this can be only removed if you have access to the source code. For Matomo Cloud this can't be changed.

mattab commented 2 years ago

it's from an email campaign that includes campaign parameters in the email, but the visit (and subsequent ecommerce purchase) is still attributed to a website, with no trace of the campaign to be found in the visitor's profile. (I've disabled cookies, don't know if that has anything to do with it.)

@schuetzm if you disable cookies, then the "same-visit" conversions should be still attributed, but indeed any visit from a newsletter from days ago or weeks ago will not be attributed to the newsletter. Only visits generated from the newsletter and directly converting will be attributed. (ref = https://matomo.org/faq/general/faq_156/)

Could you describe how this would get us correct attributions both in the absence of a full knowledge of the referrers we have to exclude, and without requiring every single user to adjust their configuration?

the knowledge of the problematic referrers is clear to people who have the problem, because most of the goal conversions / ecommerce conversions are attributed to these "problematic referrers" so they appear in many places in the reports and it looks buggy. So if people realise this is buggy, then hopefully they will find the feature (although I can see how that won't be easy for many people).

The example with deref-gmx.net shows how fragile and open ended that is, as you probably wouldn't have thought of this domain.

That's why the feature lets you enter the domain within the UI of Matomo so you can enter that domain name there and it will apply to all websites automatically (or you can only set it for one website if you want)

If we get complaints from a few people like we did for paypal.com then we can also add it to the Matomo list of excluded referrers so it will be applied to all Matomo users. But so far it was only very obvious for paypal.com
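As a rough illustration of what such an exclusion list does, the sketch below checks a referrer URL against a list of excluded domains, treating subdomains as matches. The function name `is_excluded_referrer` and the matching rule are assumptions for this example; how Matomo itself matches entries is not shown here.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_excluded_referrer(referrer_url: str, excluded_hosts: Iterable[str]) -> bool:
    """Return True if the referrer's host is an excluded domain or a subdomain of one."""
    host = (urlsplit(referrer_url).hostname or "").lower()
    if not host:
        return False
    for excluded in excluded_hosts:
        excluded = excluded.lower().lstrip(".")
        # Match the domain itself or any subdomain, but not look-alike names.
        if host == excluded or host.endswith("." + excluded):
            return True
    return False


excluded = ["paypal.com", "deref-gmx.net"]
print(is_excluded_referrer("https://www.paypal.com/checkout", excluded))
print(is_excluded_referrer("https://deref-gmx.net/mail/client", excluded))
print(is_excluded_referrer("https://notpaypal.com/", excluded))
```

The suffix check uses a leading dot so that a look-alike host such as notpaypal.com is not treated as a subdomain of paypal.com.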

e.g. the campaign id changing in the middle of a visit. IMO these should probably be counted as a new visit,

Fyi that's already the case, see: https://matomo.org/faq/how-to/faq_19616/

mattab commented 2 years ago

@adsham I believe your attribution issues will be fixed on Matomo Cloud once the PRs are merged and released and deployed on the Cloud. Not sure yet when this will be all done but should be by end of July.

adsham commented 2 years ago

@adsham I believe your attribution issues will be fixed on Matomo Cloud once the PRs are merged and released and deployed on the Cloud. Not sure yet when this will be all done but should be by end of July.

Thank you @mattab for the update; looking forward to finally having this resolved.

schuetzm commented 2 years ago

Not updating the referrer at all can be easily achieved by removing a couple of methods:

--- snip ---

Removing those would fully stop updating any referrer information after the first tracking request of a visit.

Thanks! I commented them out and invalidated the historical data, but there are still many conversions that are attributed to Paypal. Would this change only apply to future visits?

sgiehl commented 2 years ago

@schuetzm Yes. That only applies to future visits.

Demichev commented 2 years ago

I disabled referrer overwrites in our Matomo instance in May, and I have analyzed the last 4 months in our Matomo DB (direct SQL queries). Referrer overwrites produced badly wrong statistics: 12 conversions in April for one referrer site, but only 3 of them are correct. A common pattern is that a user buys a product (with direct entry), then wanders back and forth to the blog or forum, and then returns to the main site. I am not going to argue with the Matomo team anymore; I am used to making a series of changes in the Matomo code after each applied update. But I found one strange thing: if a visit starts from a page of our own site (an old open tab or something), then the matomo_log_visit.referer_url field is filled with this referrer but matomo_log_visit.referer_name = NULL. Then the user goes to Facebook and returns to our site, and referer_name becomes facebook (when referrer overwrites are not disabled) but referer_url = . Do you think this is correct behaviour, @sgiehl?

sgiehl commented 2 years ago

@Demichev Do you know what the referrer type was set to? Was the initial value 1 and then updated to 7? The referrer overwrites are only looking at the type afaik.

schuetzm commented 2 years ago

I can also confirm that with the above mentioned functions disabled we're now seeing more plausible statistics: many direct visits, many from particular search engines, and a few from campaigns. (Although there are still a few from Paypal, where it probably lost the connection because we're not using cookies.)

So, can we please have a switch for this behaviour in the configuration? Do you want me to open a separate issue for this feature request?

Demichev commented 2 years ago

@Demichev Do you know what the referrer type was set to? Was the initial value 1 and then updated to 7? The referrer overwrites are only looking at the type afaik.

As far as I can see, all refs to our site with a null name have type = 1, and those with a filled name have types 2 (search engines, I guess), 3 (other sites), and 7 (YouTube). I don't know what it means.

sgiehl commented 2 years ago

@Demichev See https://github.com/matomo-org/matomo/blob/53c00a78caf96d24dd8f7f74dc8fd74268b312b1/core/Common.php#L26-L30

Not sure if storing the referrer url, if it is flagged as a direct entry, is correct. A URL from the same site imho should never be stored as referrer url, or at least I can't see any good reason for that.
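For reading the numeric types in the database, the referrer type constants can be written out as a lookup. The values below are as I understand them from the constants in core/Common.php, so please check them against the linked lines; the enum and the `describe` helper are only an illustration and not part of Matomo.

```python
from enum import IntEnum


class ReferrerType(IntEnum):
    """Referrer type ids (values assumed from Matomo's core/Common.php)."""
    DIRECT_ENTRY = 1
    SEARCH_ENGINE = 2
    WEBSITE = 3
    CAMPAIGN = 6
    SOCIAL_NETWORK = 7


def describe(type_id: int) -> str:
    """Map a referer_type value to a readable name (hypothetical helper)."""
    try:
        return ReferrerType(type_id).name
    except ValueError:
        return "UNKNOWN"


for type_id in (1, 2, 3, 7):
    print(type_id, describe(type_id))
```

Under that reading, type 1 is a direct entry and type 7 is a social network, which would fit a referrer like YouTube or Facebook.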

mattab commented 2 years ago

A common picture is that user bought a product (with direct entry) and then wanders back and forth to the blog or forum and then returns to the main site.

when this is happening and causing data issues, then it may be worth also tracking all sub-domains/sites in one new Matomo website using cross domain tracking https://matomo.org/faq/how-to/faq_23654/

Alternatively, another workaround (with a downside) is to define your blog and forum as "alias URLs" in the main site in Matomo: "Alias URLs: domains and subdomains that are tracked in this website. This will ensure that tracked domains don’t appear in the Referrer report." See https://matomo.org/faq/how-to/create-and-manage-websites/ for details. The downside is that the blog and forum then won't be attributed as a source directly anymore (unless you cross-link with URL campaign parameters).

Demichev commented 2 years ago

when this is happening and causing data issues, then it may be worth also tracking all sub-domains/sites in one new Matomo website using cross domain tracking https://matomo.org/faq/how-to/faq_23654/

I know about multi domain visitors, we plan to test it in the future.

Not sure if storing the referrer url, if it is flagged as a direct entry, is correct. A URL from the same site imho should never be stored as referrer url, or at least I can't see any good reason for that.

I don't know a good solution, but the current inconsistency, where different refs are stored for url and name, looks wrong to me.
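For anyone who wants to look for the same pattern, here is a rough sketch of the kind of check I mean. It runs against an in-memory SQLite table with made-up rows so that it is self-contained; a real Matomo install uses MySQL/MariaDB, and only the table and column names (matomo_log_visit, idvisit, referer_type, referer_name, referer_url) are meant to mirror the real schema. Some referrer types may legitimately have a name without a URL, so treat the results as candidates to inspect, not as confirmed errors.

```python
import sqlite3

# Rows where exactly one of referer_name / referer_url is filled.
QUERY = """
SELECT idvisit, referer_type, referer_name, referer_url
FROM matomo_log_visit
WHERE (referer_name IS NULL AND COALESCE(referer_url, '') <> '')
   OR (referer_name IS NOT NULL AND COALESCE(referer_url, '') = '')
ORDER BY idvisit
"""


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with invented sample visits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE matomo_log_visit ("
        "idvisit INTEGER PRIMARY KEY, referer_type INTEGER, "
        "referer_name TEXT, referer_url TEXT)"
    )
    conn.executemany(
        "INSERT INTO matomo_log_visit VALUES (?, ?, ?, ?)",
        [
            (1, 1, None, "https://shop.example/cart"),     # own-site URL, no name
            (2, 7, "Facebook", ""),                        # name set, URL empty
            (3, 2, "Google", "https://www.google.com/"),   # consistent
            (4, 1, None, ""),                              # consistent direct entry
        ],
    )
    return conn


def find_inconsistent(conn: sqlite3.Connection) -> list:
    """Return visits whose referrer name and URL disagree about being set."""
    return conn.execute(QUERY).fetchall()


for row in find_inconsistent(demo_connection()):
    print(row)
```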

sgiehl commented 2 years ago

Thanks again for all the valuable inputs.

I will close this issue here now. Based on the discussion in here, we have implemented some new features that allow excluding referrers:

I'm aware that besides those features there is still a discussion about updating referrer attributions in general. As this issue by now contains too many different topics, it's hard to see which problems were actually solved and which still remain. I have therefore created #19657, which is meant to handle only the topic "Updating referrer attribution". If anyone has further input on that specific topic, feel free to comment there. If there are any other issues regarding referrer attribution, please consider creating new issues with specific topics, so we can handle them more easily.

MatomoForumNotifications commented 2 years ago

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/multi-channel-conversion-attribution-models-comparison/45580/12

MatomoForumNotifications commented 1 year ago

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/does-matomo-attribute-more-traffic-to-the-direct-channel/47334/2

MatomoForumNotifications commented 8 months ago

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/referral-exclusions/33582/34