OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0

Pre-thresholding #443

Closed (dvadym closed this 1 year ago)

dvadym commented 1 year ago

This PR implements pre-thresholding in PipelineDP. It consists of:

  1. Extending AggregateParams and SelectPartitionParams with pre_threshold
  2. Propagating pre_threshold to PyDP create_partition_selection

Pre-thresholding itself happens in the C++ partition selection algorithm.
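
For reviewers unfamiliar with the idea, here is a rough pure-Python sketch of what pre-thresholding is assumed to do. This is not the PipelineDP or PyDP code. SketchSelectParams, should_keep_partition and toy_keep_probability are hypothetical names made up for illustration, and the toy probability function is not a real DP mechanism. The sketch assumes a partition is dropped outright when it has fewer than pre_threshold privacy units, and that otherwise the underlying selection strategy sees the count reduced by pre_threshold - 1.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchSelectParams:
    """Hypothetical stand-in for a params object that carries pre_threshold."""
    max_partitions_contributed: int
    pre_threshold: Optional[int] = None


def toy_keep_probability(num_privacy_units: int) -> float:
    """Illustrative only, NOT a real DP partition selection mechanism."""
    return max(0.0, min(1.0, num_privacy_units / 10))


def should_keep_partition(
    num_privacy_units: int,
    params: SketchSelectParams,
    keep_probability: Callable[[int], float] = toy_keep_probability,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Applies the assumed pre-thresholding rule, then the wrapped strategy."""
    pre_threshold = params.pre_threshold
    if pre_threshold is not None:
        # Assumption: partitions below the pre-threshold are never released.
        if num_privacy_units < pre_threshold:
            return False
        # Assumption: the wrapped strategy sees the count shifted down so that
        # exactly pre_threshold units looks like a single unit.
        num_privacy_units -= pre_threshold - 1
    return rng() < keep_probability(num_privacy_units)


params = SketchSelectParams(max_partitions_contributed=1, pre_threshold=5)
below = should_keep_partition(4, params)    # under the pre-threshold
shifted = should_keep_partition(14, params)  # strategy sees 14 - 4 = 10
```

In this sketch the threshold check is deterministic and only the wrapped strategy is randomized, which is why the params object just needs to carry the integer through to wherever the strategy is created.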