microsoft / APSI

APSI is a C++ library for Asymmetric (unlabeled or labeled) Private Set Intersection.
MIT License
186 stars, 42 forks

8 cores, 16 GB RAM: how to handle large datasets? At about 1 million items, bundle.cache() takes up 17 GB #52

Closed yellow123Nike closed 1 year ago

yellow123Nike commented 1 year ago

My machine has 8 cores and 16 GB of RAM. How should I handle large datasets? At about 1 million items, bundle.cache() takes up 17 GB.

kimlaine commented 1 year ago

Unfortunately, there is a significant cost in encoding a dataset into a SenderDB object. This can be controlled to some extent by adjusting the parameters. Another option is to shard your dataset into smaller parts and use some mechanism to route each query to the right shard. Of course, this leaks some information about which shard the query lands in. A third option is to get a machine with more RAM.
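
To make the sharding idea concrete, here is a minimal sketch of one possible "mechanism": assign each item to a shard by hashing it, so that both sides can compute the shard index independently. This is not part of the APSI API. The names `fnv1a64`, `shard_index`, and `shard_items` are hypothetical helpers, and the use of FNV-1a as the hash is an assumption made only because it gives the same result on every platform. Each resulting shard would then be loaded into its own SenderDB, one at a time or on separate machines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an APSI function): 64-bit FNV-1a hash.
// Chosen here because it is deterministic across platforms, so sender and
// receiver agree on the shard of a given item; std::hash gives no such guarantee.
inline std::uint64_t fnv1a64(const std::string &s) {
    std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL; // FNV prime
    }
    return h;
}

// Hypothetical helper: map an item to a shard in [0, num_shards).
inline std::size_t shard_index(const std::string &item, std::size_t num_shards) {
    assert(num_shards > 0);
    return static_cast<std::size_t>(fnv1a64(item) % num_shards);
}

// Hypothetical helper: split the sender's items into num_shards buckets.
// Every item lands in exactly one bucket.
inline std::vector<std::vector<std::string>> shard_items(
    const std::vector<std::string> &items, std::size_t num_shards) {
    std::vector<std::vector<std::string>> shards(num_shards);
    for (const auto &item : items) {
        shards[shard_index(item, num_shards)].push_back(item);
    }
    return shards;
}
```

With this layout, the receiver would compute `shard_index(query, num_shards)` and query only that shard. That is exactly where the leak mentioned above comes from: the sender learns which shard the query falls in. Using fewer, larger shards reduces what is revealed but raises the memory needed per shard.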