matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0

[Bug] Internal URLs in Acquisition/Websites #21633

Closed by ke-kialo 7 months ago

ke-kialo commented 9 months ago

What happened?

We sometimes see URLs of our own website under Acquisition/Websites when expanding the subtable of an external website (the entries starting with p/ are our internal URLs):

[Screenshot: kialo-edu-com-2023-12-03-Web-Analytics-Reports-Matomo]

We would expect to see only pages of the expanded website there.

When we inspect those internal URLs further by clicking on "Open the segmented visit logs" next to them, we see something like this:

[Screenshot: matomo_bug3]

The website shown is the external URL, but hovering over it reveals that the link actually points to the internal URL, which should not appear there.

This report in the forum seems similar.

We have set our internal URLs in Settings -> Websites -> Manage -> our-site -> Edit -> URLs.

We use Matomo 4.15.1.

The issue is independent of the browser used.

What should happen?

We would not expect internal URLs in Acquisition/Websites.

How can this be reproduced?

We did not find a way to reproduce the bug but found several occurrences in our data.

Matomo major version

Matomo 4

Matomo minor or patch Version

4.15.1

PHP version

8.2.13

Server operating system

docker.io/matomo:4.15.1-fpm-alpine

What browsers are you seeing the problem on?

Chrome, Not applicable (e.g. an API call etc.)

Computer operating system

Ubuntu

Relevant log output

No response

mneudert commented 9 months ago

Hi @ke-kialo, thank you for raising this issue.

Could you provide some more details to find out what is happening here?

If you look at the bottom of the websites report, there should be an export button:

[Screenshot from 2023-12-05 16-34-51]

If you click on that, and then select "XML", "Flatten report", and then click on "Export", you should get an export with entries looking like this:

<row>
  <label>referrer.example.org/url?sa=t&rct=j</label>
  <!-- some more fields stripped -->
  <url>http://referrer.example.org/url?sa=t&rct=j</url>
  <Referrers_Website>referrer.example.org</Referrers_Website>
  <Referrers_WebsitePage>url?sa=t&rct=j</Referrers_WebsitePage>
</row>

Can you check what that export contains for the problematic rows? For example, is the information already wrong there (an external "Referrers_Website" but an internal "url" field)? Also, what does the full referrer look like? Does it contain any parameters that include your own website and might be misinterpreted?

ke-kialo commented 9 months ago

@mneudert thank you for the answer.

Here is a snippet from the XML export:

<row>
  <label>classroom.google.com/p/<access_link_token_from_our_website>/122902</label>
  ...
  <max_actions>29</max_actions>
  <sum_visit_length>703</sum_visit_length>
  <url>https://www.kialo-edu.com/p/<access_link_token_from_our_website>/122902</url>
  <Referrers_Website>classroom.google.com</Referrers_Website>
  <Referrers_WebsitePage>p/<access_link_token_from_our_website>/122902</Referrers_WebsitePage>
</row>

The Referrers_Website is correct, but the url and the Referrers_WebsitePage both come from our website. The label is a mixture of the referrer's website and our page; no such page exists on the referrer's website.
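For anyone who wants to scan a whole export for rows like this one, here is a minimal Python sketch. It assumes the flattened XML export wraps its `<row>` elements in a single root element (named `result` in the sample, which is an assumption) and that special characters such as `&` are escaped in the real file. The function name `find_mismatched_rows`, the sample data, and the `TOKEN` placeholder are hypothetical. It flags every row whose `url` host differs from its `Referrers_Website`; it uses a strict host comparison, so `www.` variants of the same site would also be reported.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def find_mismatched_rows(xml_text):
    """Return the rows whose <url> host differs from <Referrers_Website>."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for row in root.iter("row"):
        website = (row.findtext("Referrers_Website") or "").strip().lower()
        url = (row.findtext("url") or "").strip()
        host = (urlparse(url).hostname or "").lower()
        # Skip rows that lack either field; only compare complete rows.
        if website and host and host != website:
            mismatches.append({
                "label": row.findtext("label"),
                "url": url,
                "Referrers_Website": website,
                "url_host": host,
            })
    return mismatches


# Hypothetical sample modelled on the two rows quoted in this thread.
# TOKEN stands in for the redacted access link token.
SAMPLE = """<result>
  <row>
    <label>referrer.example.org/url?sa=t&amp;rct=j</label>
    <url>http://referrer.example.org/url?sa=t&amp;rct=j</url>
    <Referrers_Website>referrer.example.org</Referrers_Website>
    <Referrers_WebsitePage>url?sa=t&amp;rct=j</Referrers_WebsitePage>
  </row>
  <row>
    <label>classroom.google.com/p/TOKEN/122902</label>
    <url>https://www.kialo-edu.com/p/TOKEN/122902</url>
    <Referrers_Website>classroom.google.com</Referrers_Website>
    <Referrers_WebsitePage>p/TOKEN/122902</Referrers_WebsitePage>
  </row>
</result>"""

if __name__ == "__main__":
    for row in find_mismatched_rows(SAMPLE):
        print(row["label"], "->", row["url_host"])
```

To run this against a real export, read the downloaded file into a string and pass it to `find_mismatched_rows` instead of `SAMPLE`.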

ke-kialo commented 8 months ago

@mneudert do you need anything else?

michalkleiner commented 7 months ago

@ke-kialo just an idea here: are you sending the tracking requests in some custom way, or is it all just the standard JS tracker installed on the pages? If it were custom, the URL might be missing the leading / and the host/domain might be getting added to make it a full URL. I might be completely wrong here, just thinking out loud.

Secondly, is this still happening or was it an issue in the past only?

And lastly, would it be possible (and safe, of course) to upgrade to the latest Matomo 5.0.2 to see if the issue still persists there? Or run it in parallel with your current 4.x instance and do some more testing?

@mneudert any other ideas?

ke-kialo commented 7 months ago

We updated to Matomo 5.0.2 about a week ago, and this seems to have fixed the issue, so this can be closed. (We use the standard tracking code.)