patcg-individual-drafts / topics

The Topics API
https://patcg-individual-drafts.github.io/topics/

Improving added value of Topics #319

Open remysaissy opened 2 weeks ago

remysaissy commented 2 weeks ago

Teads believes that the most important thing is to understand how rare a given signal is. Why? Because topics link browsing history with context in a privacy-compliant way. This property can be used when joining topics on the server side to build contextual clusters without third-party cookies.

To achieve this, the current design should evolve to allow a weight value alongside each observed topic. That weight is the relative weight of a given topic for a browser compared to all browsers using the topic.

Also, all browsers can be considered within two scopes:

For this latter case, a TEE service might enable an Ad-Tech to bear the cost of that analysis in a privacy-compliant way.

To take an analogy, this algorithm works like TF-IDF, which weighs how often a word occurs in a document against how common that word is across all documents. In Teads' case, the words are topics and the documents are browsers.
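To make the analogy concrete, here is a minimal sketch of textbook TF-IDF applied to topics and browsers. It is not the weighting Teads proposes in detail, only an illustration of the idea. The function name `topic_tf_idf`, the browser ids, and the topic labels are all hypothetical, and the sketch assumes raw per-browser topic lists are available in one place, which a real deployment (for example, inside a TEE) would have to arrange differently.

```python
import math
from collections import Counter


def topic_tf_idf(browser_topics):
    """Return {browser: {topic: weight}} using plain TF-IDF.

    browser_topics maps a (hypothetical) browser id to the list of topic
    labels observed for it; repeated labels count as repeated occurrences.
    """
    n_browsers = len(browser_topics)

    # Document frequency: in how many browsers does each topic appear?
    df = Counter()
    for topics in browser_topics.values():
        df.update(set(topics))

    weights = {}
    for browser, topics in browser_topics.items():
        counts = Counter(topics)
        total = sum(counts.values())
        weights[browser] = {
            # Term frequency within this browser times inverse browser frequency.
            topic: (count / total) * math.log(n_browsers / df[topic])
            for topic, count in counts.items()
        }
    return weights


# Illustrative data only: three browsers, made-up topic labels.
browsers = {
    "a": ["Sports", "Sports", "News"],
    "b": ["News", "Travel"],
    "c": ["News"],
}
weights = topic_tf_idf(browsers)
```

In this toy data, "News" appears in every browser, so its inverse frequency is log(1) = 0 and it gets zero weight everywhere, while "Sports" and "Travel", each seen in a single browser, get positive weights. That is the sense in which the weight captures how rare a signal is.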

jkarlin commented 2 weeks ago

This is largely a duplicate of #42, where we discussed inverse frequency. In the end, we did model inverse frequency and found that it did not correlate terribly well with topic value, according to values provided by a few sellers and buyers.