Closed yohhaan closed 1 week ago
Hi, thanks for creating the issue. The Colab demo that you reference was meant as a one-time demonstration of how one might extract and use the classifier model, not as an evergreen document. We leave it as an exercise to developers to look at Chrome's code and to keep up with further changes if they want to copy Chrome's behavior over time.
Problem Description
The documentation of the Topics API for the web links to a demo on Google Colab that performs inference with the model used by Chrome. However, this demo does not follow the same algorithm as the one executed in Google Chrome. As a result, classification results differ.
Some differences in the Colab:
www. prefix
I would suggest updating the Colab demo to exactly mirror Google Chrome's implementation of the Topics API for the web. This would avoid potential confusion due to classification mismatches between the Colab and Chrome implementations.
Resources
In this blog post and this paper, I describe the steps performed in Google Chrome when a hostname is classified by the Topics API; see in particular the post-filtering algorithm.
Here is my correct reimplementation of the classification performed in Google Chrome: https://github.com/yohhaan/topics_classifier