Closed philipp-classen closed 2 years ago
From the raw data, the most popular pages in my sample data were https://www.nfl.com/ and https://www.lowes.com/. Both sent 3rd party requests to https://analytics.tiktok.com/i18n/pixel/identify.js with the same cookie:
cookie: ttwid=1%7CbA3a_4BMyb4ekrjPp4aDV20oq9YBjHAHqUZGZ1b-0aA%7C1641585649%7C11cda80986961557d1772f348d7ae2fce7887e551b671e1faf2343b0058cbd40
On a different profile, I got a different unique identifier. It is not clear why we don't detect it, so I'll mark it as a bug. This is clearly cross-site tracking, and the amount of traffic is not so small that we would miss it for that reason. (The Ghostery extension also reports it as a tracker, by the way.)
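To make the observation above concrete, here is a minimal sketch of the check I did by hand: group third-party requests by (third-party host, cookie value) and flag values that show up on more than one first-party site. This is not the WhoTracks.me detection algorithm; the function name, the tuple layout, and the plain hostname comparison (instead of a proper registrable-domain check) are assumptions made for illustration only.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_cross_site_cookie_values(observations, min_sites=2):
    """Return {(third_party_host, cookie): {first_party_hosts}} for cookie
    values that one third party received from at least `min_sites` sites.

    `observations` is an iterable of (page_url, request_url, cookie_header)
    tuples. This layout is hypothetical, not the tp_events schema.
    """
    sites_by_key = defaultdict(set)
    for page_url, request_url, cookie in observations:
        first_party = urlsplit(page_url).hostname
        third_party = urlsplit(request_url).hostname
        # Simplification: compare full hostnames. A real check would compare
        # registrable domains (eTLD+1) instead.
        if not cookie or first_party == third_party:
            continue
        sites_by_key[(third_party, cookie)].add(first_party)
    return {
        key: sites
        for key, sites in sites_by_key.items()
        if len(sites) >= min_sites
    }


if __name__ == "__main__":
    pixel = "https://analytics.tiktok.com/i18n/pixel/identify.js"
    cookie = (
        "ttwid=1%7CbA3a_4BMyb4ekrjPp4aDV20oq9YBjHAHqUZGZ1b-0aA%7C1641585649"
        "%7C11cda80986961557d1772f348d7ae2fce7887e551b671e1faf2343b0058cbd40"
    )
    observations = [
        ("https://www.nfl.com/", pixel, cookie),
        ("https://www.lowes.com/", pixel, cookie),
    ]
    for (host, value), sites in find_cross_site_cookie_values(observations).items():
        print(host, sorted(sites), value[:20] + "...")
```

A value that is identical across sites within one profile but differs between profiles is what makes it usable as a cross-site identifier, so a real check would also need to compare profiles.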
We are confident that we tracked it down to a missing mapping step in our internal processing pipeline. Existing trackers are not affected, but new ones will not be detected. Everything is there in the raw data, so once we fix it, they should show up when the data is recomputed.
From what I see in the sample data, analytics.tiktok.com is one of the largest that we miss. static.cloudflareinsights.com is another candidate that we should investigate (it also had the most traffic).
The data from January has now been processed. analytics.tiktok.com is now detected as a tracker:
https://whotracks.me/trackers/tiktok_analytics.html
It could be that the stats will change next month, as we made the internal changes in the middle of January. That could affect the estimated popularity (increasing its relative ranking). Otherwise, the data looks OK as far as I can tell (e.g. the sample sites nfl.com and lowes.com are among the most prominent pages).
Leaving it open till March to compare whether the data changes with a full month.
For reference, this is how it looks now:
Closing it now. With the March release, the reach increased (0.7% to 1.2%), but the data looks stable.
Currently, it is not detected as a tracker by WhoTracks.me, but perhaps it should be. In the raw data (tp_events), third-party requests do exist. The question now is whether it should have been detected as a third-party tracker. And if so, why are the algorithms missing it? (Context: originally reported here https://github.com/whotracksme/whotracks.me/issues/261)