Closed fteverini closed 3 years ago
Yes, two neighboring cohorts are likely to be based on similar SimHashes. More info on how the SimHash is mapped to a cohort ID is here: https://www.chromium.org/Home/chromium-privacy/privacy-sandbox/floc
Yup, thanks Don, just the link I would have posted.
Joining up neighboring cohorts is easy, and should indeed do a pretty good job.
It is possible to do something more complicated and probably do a somewhat better job of gathering cohorts together, based on looking at how many bits of SimHash contribute to each cohort ID. If you're interested, the Go re-implementation of FLoC clustering by @shigeki is probably the best way to explore this: https://github.com/shigeki/floc_simulator. The goal would be to group together adjacent flocks if they come from SimHashes with a common prefix: if you have cohorts {A="00", B="01", C="1*"}, then it's better to combine A with B than B with C, because A and B share the prefix "0" while B and C share nothing. But this is probably more work than it's worth.
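To make the prefix idea concrete, here is a minimal Python sketch. It is not how Chrome or floc_simulator does anything: the function names and the representation of each cohort as a bit-string prefix (with the trailing `*` wildcard dropped) are assumptions for illustration only. It orders cohorts by prefix and picks the adjacent pair that shares the longest common prefix.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two SimHash prefixes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_adjacent_merge(prefixes):
    """Return the pair of adjacent cohorts sharing the longest prefix.

    `prefixes` maps a (hypothetical) cohort name to its SimHash bit
    prefix, e.g. "1" stands for the wildcard pattern "1*".
    """
    # Sorting the prefix strings puts the cohorts in SimHash order,
    # provided no prefix is itself a prefix of another one.
    ordered = sorted(prefixes.items(), key=lambda kv: kv[1])
    pairs = zip(ordered, ordered[1:])
    left, right = max(
        pairs, key=lambda p: common_prefix_len(p[0][1], p[1][1])
    )
    return left[0], right[0]


cohorts = {"A": "00", "B": "01", "C": "1"}
print(best_adjacent_merge(cohorts))
```

With the three cohorts from the example, A and B share one leading bit and B and C share none, so the sketch picks A and B. Simply joining neighboring cohort IDs skips this comparison, which is why it is the easier option.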
In the current trial version, do FLoC IDs preserve a meaningful order? That is, is cohort 12234 close to cohort 12235, or is there no relationship between cohort IDs? (For example, there would be a relationship if the prefixes corresponding to the cohorts were numbered in some order to obtain the final cohort ID.) And if there is such an ordering in the current trial, is it planned to keep it in future FLoC versions? Thanks