prestodb / airlift

Airlift framework for building REST services
Apache License 2.0
1 stars 42 forks source link

Add PrivateLpcaSketch for private cardinality estimation #52

Closed jonhehir closed 2 years ago

jonhehir commented 2 years ago

PrivateLpcaSketch is essentially a privacy-preserving analog to the existing HyperLogLog sketch. Where HyperLogLog stores small integer values in each bucket, PrivateLpcaSketch stores only a single bit indicating whether the corresponding value is above or below a threshold fixed at initialization. This minimization, combined with some additional noise, yields a sketch from which it is considerably more difficult to identify an individual value's presence or absence. Because of the relationship between PrivateLpcaSketch and HyperLogLog, the former is always created (or updated) from the latter. (In short, a HyperLogLog is privatized to a PrivateLpcaSketch.)

jonhehir commented 2 years ago

cc @DanielTing @ctgraves @BrandonVo @bpxlion @chenkuei