OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0

Add logic for finding `max_sum_per_partition` candidates #484

Closed RamSaw closed 1 year ago

RamSaw commented 1 year ago

Uses `linf_sum_contributions_histogram` to compute the candidates: each candidate is the max value of a bin, and the bins are sampled uniformly.
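For illustration only, a minimal sketch of the idea, not the code in this PR. The `Bin` class, the `max_sum_candidates` function, and the `max_candidates` parameter are hypothetical names, and "sampled uniformly" is assumed here to mean evenly spaced bin indices (always including the first and last bin) rather than random sampling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bin:
    """Hypothetical stand-in for a histogram bin (not the PipelineDP class)."""
    lower: float  # lower edge of the bin
    max: float    # largest value observed in the bin


def max_sum_candidates(bins: List[Bin], max_candidates: int) -> List[float]:
    """Returns up to `max_candidates` candidates, one per selected bin.

    Bins are assumed to be sorted by `lower`. Each candidate is the max
    value of the bin it was taken from.
    """
    if max_candidates < 1:
        raise ValueError("max_candidates must be at least 1")
    n = len(bins)
    if n <= max_candidates:
        # Few enough bins: every bin contributes a candidate.
        return [b.max for b in bins]
    if max_candidates == 1:
        # A single candidate: take the largest bin so nothing is clipped.
        return [bins[-1].max]
    # Evenly spaced indices from 0 to n - 1; integer arithmetic avoids
    # floating-point rounding and guarantees the last bin is included.
    indices = [i * (n - 1) // (max_candidates - 1)
               for i in range(max_candidates)]
    return [bins[i].max for i in indices]


# Usage: 10 bins whose max values are 1.0, 2.0, ..., 10.0.
example_bins = [Bin(lower=float(i), max=float(i + 1)) for i in range(10)]
candidates = max_sum_candidates(example_bins, max_candidates=4)
```

Taking the bin max rather than the bin's lower edge means every value in the selected bin fits under the candidate bound.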

`min_sum_per_partition` is not implemented and is always zero.

Utility analysis for sum was not implemented; this PR also adds support for it by modifying the `_sum_metrics_to_data_dropped` method.