whotracksme / whotracks.me

Data from the largest and longest measurement of online tracking.
https://www.ghostery.com/whotracksme
MIT License

Piwik PRO and Matomo should be separated #264

Closed philipp-classen closed 2 years ago

philipp-classen commented 2 years ago

Currently, we mix data from both Piwik PRO and Matomo in one entry, even though they are different entities: https://www.whotracks.me/trackers/piwik.html

There should be two separate entries that are not linked: one for Matomo and one for Piwik PRO.

philipp-classen commented 2 years ago

There is a tracker entry for matomo (covering the Matomo domains) and one called piwik (covering piwik.org and piwik.pro). Here is a summary of the steps done so far to split it:

The effect should become visible in the next monthly update to the WhoTracksMe data set and the website. (Depending on the outcome, some more tweaking might be necessary.)
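
As a rough illustration of the intended split (not the actual WhoTracksMe schema or tooling), the change can be sketched as a re-assignment of domains in a tracker-to-domains mapping. The dictionary layout, the `piwik_pro` identifier, the `OWNER_OF` table, and `matomo.org` as an example Matomo domain are all assumptions made for this sketch; only piwik.org and piwik.pro come from the discussion above.

```python
# Hypothetical data model: tracker id -> set of domains.
# This is NOT the real WhoTracksMe schema, only an illustration.
before = {
    "matomo": {"matomo.org"},             # assumed example of a Matomo domain
    "piwik": {"piwik.org", "piwik.pro"},  # the mixed entry described in this issue
}

# Assumed ownership table for the domains of the mixed entry.
# "piwik_pro" is a made-up identifier for the new Piwik PRO entry.
OWNER_OF = {
    "piwik.org": "matomo",
    "piwik.pro": "piwik_pro",
}


def split_piwik(trackers):
    """Return a new mapping in which the mixed 'piwik' entry is split by owner."""
    # Copy the sets so the input mapping is left untouched.
    result = {name: set(domains) for name, domains in trackers.items()}
    for domain in result.pop("piwik", set()):
        # A KeyError here signals a domain that has not been assigned to an owner.
        owner = OWNER_OF[domain]
        result.setdefault(owner, set()).add(domain)
    return result


after = split_piwik(before)
```

In this sketch the old `piwik` key disappears entirely, which mirrors the goal of having two entries that are not linked.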

philipp-classen commented 2 years ago

With the March release of the February data, there are now two separate pages:

philipp-classen commented 2 years ago

The old page is still generated, even though it no longer receives updates (it has data from October 2020 to January 2022, but February 2022 is not included): https://www.whotracks.me/trackers/piwik.html

So, currently the old page is still present and its information is incorrect (Piwik PRO is mapped to piwik.org, which is owned by Matomo). Ideally, it should be removed (if possible) or the text should be changed.

philipp-classen commented 2 years ago

(As a reminder) Another observation is that Piwik PRO is currently shown with a long list of subdomains, while for other trackers the list is mostly limited to the domain level or to a selective group of subdomains (e.g. analytics.tiktok.com). At the moment, recently added trackers are not handled consistently with the older ones.

To be consistent with the existing trackers, the *.piwik.pro entries could be shown as a single piwik.pro entry. The same observation also applies to the recently added entry for MS Clarity (https://github.com/whotracksme/whotracks.me/issues/265).
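
A minimal sketch of that collapsing step, assuming a naive "last two labels" rule and an explicit allowlist for the selective subdomains that should stay visible. The function name `collapse_hosts`, its `keep` parameter, and the sample `*.piwik.pro` hostnames are hypothetical; this is not how the WhoTracksMe pipeline is known to work, and a real implementation would likely need the Public Suffix List to handle suffixes such as `co.uk`.

```python
def collapse_hosts(hosts, keep=frozenset()):
    """Collapse hostnames to their last two labels, except those listed in `keep`.

    Naive rule: "a.b.example.com" -> "example.com". This is wrong for
    multi-label public suffixes (e.g. "example.co.uk"), so treat it as a sketch.
    """
    shown = set()
    for host in hosts:
        host = host.lower().rstrip(".")
        if host in keep:
            # Selective subdomains (such as analytics.tiktok.com) stay as they are.
            shown.add(host)
        else:
            shown.add(".".join(host.split(".")[-2:]))
    return sorted(shown)


# Made-up subdomains, used only to illustrate the collapsing.
piwik_hosts = ["a.piwik.pro", "b.containers.piwik.pro", "piwik.pro"]
print(collapse_hosts(piwik_hosts))  # ['piwik.pro']

# An allowlisted subdomain is shown unchanged.
print(collapse_hosts(["analytics.tiktok.com"], keep={"analytics.tiktok.com"}))
```

The allowlist keeps the existing behaviour for trackers where only a specific subdomain is meaningful, while everything else is reported at the domain level.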

philipp-classen commented 2 years ago

That should be fixed now. Will leave the ticket open to verify that the next monthly website release (in the upcoming days) does not re-introduce the problem.

Note: The original URL should no longer be reachable (https://www.whotracks.me/trackers/piwik.html). It has been split into https://whotracks.me/trackers/matomo.html and https://whotracks.me/trackers/piwik_pro_analytics_suite.html.

philipp-classen commented 2 years ago

The April release is out. Verified that the outdated page was not re-introduced.