mjuarezm closed this issue 3 years ago
Hi @mjuarezm! We are indeed interested in trying a taxonomy-based FLoC clustering mechanism like the one in the whitepaper. But Chrome doesn't have any preexisting on-device mechanism for understanding the contents of pages according to any taxonomy. So the one-hot encoding of domain names is what we know how to do right now, and the more ambitious clustering proposals are candidates for future work.
Thanks for the clarification!
Version "chrome.2.1" encodes the user profile as a one-hot encoding of eTLD+1 domain names. However, the experiments in the FLoC whitepaper show that the TF-IDF and taxonomy-based ("Vert depth 3" in the paper) encodings perform better than one-hot encoding.
What motivated the decision to use one-hot encoding given the results shown in these figures?
Apologies if I missed something.
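For readers unfamiliar with the term, here is a minimal sketch of what a one-hot encoding of visited domains could look like. It is an illustration only, not Chrome's implementation: the function name `one_hot_profile`, the fixed `vocabulary` list, and the sample domains are all hypothetical, and the input is assumed to be already reduced to eTLD+1 form.

```python
def one_hot_profile(visited_domains, vocabulary):
    """Return a 0/1 vector with a 1 for each vocabulary domain the user visited.

    Assumption: both arguments already contain eTLD+1 domain names.
    Domains outside the vocabulary are ignored.
    """
    # Map each known domain to its position in the vector.
    index = {domain: i for i, domain in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for domain in visited_domains:
        i = index.get(domain)
        if i is not None:
            vector[i] = 1  # presence only; repeat visits do not add weight
    return vector


# Hypothetical vocabulary and browsing history, for illustration only.
vocabulary = ["example.com", "news.example", "shop.example"]
history = ["news.example", "example.com", "news.example"]

print(one_hot_profile(history, vocabulary))  # [1, 1, 0]
```

Note that the vector records only whether a domain was visited. It carries no visit frequency and no information about page content, which is the kind of signal the TF-IDF and taxonomy-based encodings would add.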