Theory

This PR implements post-aggregation thresholding, i.e. the following partition selection algorithm.

1. Noise stddev and threshold T computation: both are computed from (eps, delta,
  l0_sensitivity=max_partition_contributed).
2. Contribution bounding: for each privacy unit, take the partitions to which
  it contributes; if there are more than max_partition_contributed,
  max_partition_contributed of them are randomly sampled.
3. Aggregation: for each partition, the number of privacy units (num_privacy_units) is computed.
4. Selection: each partition is released iff
   num_privacy_units + noise >= T. When a partition is released, num_privacy_units + noise is released as well.
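
Steps 2-4 can be sketched in plain Python as follows. This is only an illustrative sketch, not the code in this PR: the function names are hypothetical, the noise is assumed to be Gaussian, and `noise_stddev` and `threshold` are taken as inputs because their computation (step 1) is delegated to PyDP.

```python
import random
from collections import defaultdict


def bound_contributions(rows, max_partition_contributed, rng):
    """Step 2: keep at most max_partition_contributed partitions per privacy unit."""
    partitions_by_unit = defaultdict(set)
    for privacy_unit, partition in rows:
        partitions_by_unit[privacy_unit].add(partition)
    bounded = {}
    for unit, partitions in partitions_by_unit.items():
        partitions = sorted(partitions)  # fixed order so a seeded run is reproducible
        if len(partitions) > max_partition_contributed:
            partitions = rng.sample(partitions, max_partition_contributed)
        bounded[unit] = partitions
    return bounded


def count_privacy_units(bounded):
    """Step 3: number of privacy units per partition."""
    counts = defaultdict(int)
    for partitions in bounded.values():
        for partition in partitions:
            counts[partition] += 1
    return dict(counts)


def select_partitions(counts, noise_stddev, threshold, rng):
    """Step 4: release a partition iff num_privacy_units + noise >= threshold."""
    released = {}
    for partition, num_privacy_units in counts.items():
        # Assumption: Gaussian noise; the real mechanism comes from PyDP.
        noised = num_privacy_units + rng.gauss(0.0, noise_stddev)
        if noised >= threshold:
            released[partition] = noised  # the noised count is released too
    return released


def post_aggregation_thresholding(rows, max_partition_contributed,
                                  noise_stddev, threshold, seed=None):
    """rows is an iterable of (privacy_unit, partition) pairs."""
    rng = random.Random(seed)
    bounded = bound_contributions(rows, max_partition_contributed, rng)
    counts = count_privacy_units(bounded)
    return select_partitions(counts, noise_stddev, threshold, rng)
```

Note that a released partition carries the noised count rather than the exact one, so the same noise sample serves both the selection decision and the released value.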

The details on computing the noise stddev and T can be found in the doc. These computations are implemented in the Google C++ building block libraries, and the Python wrappers from PyDP are used.

This algorithm is called post-aggregation thresholding because the threshold is applied to an aggregated value, namely the per-partition number of privacy units.

What this PR contains:

This PR contains the whole implementation of post-aggregation thresholding, namely:

ThresholdMechanism class, which is a wrapper around the PyDP object. The combiner needs it in order to be able to use the PyDP object.
PostAggregationThresholdingCombiner combiner, which computes privacy id counts and applies ThresholdMechanism.
Extending AggregateParams with the bool variable post_aggregation_thresholding.
Creating the PostAggregationThresholdingCombiner object, which is created when post_aggregation_thresholding = True.
Filtering partitions on "privacy_id_count = None", when post_aggregation_thresholding = True.
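
How these pieces could fit together is sketched below. Every name here (`SketchParams`, `SketchThresholdMechanism`, `SketchThresholdingCombiner`, `aggregate`) is a hypothetical stand-in rather than the PipelineDP or PyDP API, the noise is assumed to be Gaussian, and the sketch assumes the thresholding path is taken when the flag is enabled.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParams:
    """Hypothetical stand-in for AggregateParams with the new flag."""
    post_aggregation_thresholding: bool = False


class SketchThresholdMechanism:
    """Hypothetical stand-in for the wrapper; the real class delegates to PyDP."""

    def __init__(self, noise_stddev: float, threshold: float, seed=None):
        self._noise_stddev = noise_stddev
        self._threshold = threshold
        self._rng = random.Random(seed)

    def noised_value_if_should_keep(self, count: int) -> Optional[float]:
        # Assumption: Gaussian noise; None means the partition is dropped.
        noised = count + self._rng.gauss(0.0, self._noise_stddev)
        return noised if noised >= self._threshold else None


class SketchThresholdingCombiner:
    """Computes privacy_id_count and applies the threshold mechanism."""

    def __init__(self, mechanism):
        self._mechanism = mechanism

    def compute_metrics(self, num_privacy_units):
        value = self._mechanism.noised_value_if_should_keep(num_privacy_units)
        return {"privacy_id_count": value}


def aggregate(counts, params, mechanism):
    """counts maps partition -> number of privacy units (already bounded)."""
    if not params.post_aggregation_thresholding:
        # Thresholding disabled: this sketch just passes the raw counts
        # through, which offers no privacy protection by itself.
        return {p: {"privacy_id_count": n} for p, n in counts.items()}
    combiner = SketchThresholdingCombiner(mechanism)
    metrics = {p: combiner.compute_metrics(n) for p, n in counts.items()}
    # Filter out partitions whose privacy_id_count is None.
    return {p: m for p, m in metrics.items() if m["privacy_id_count"] is not None}
```

Returning None from the mechanism lets the combiner keep a uniform output shape, while the final filter is what actually removes the partitions that were not selected.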

OpenMined / PipelineDP

Post aggregation thresholding #494
