PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
This PR contains:
- StatisticsCombiner: a per-partition utility analysis combiner for computing count and privacy_id_count (in the future it can be extended to more statistics).
- Adding metrics.Statistics(privacy_id_count, count) to PerPartitionMetrics.
- Plumbing work to fill PerPartitionMetrics.statistics.
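
To illustrate the idea behind the combiner described above, here is a minimal, self-contained sketch. It does not use the PipelineDP API: the class `StatisticsCombinerSketch`, its method names, the accumulator layout, and the local `Statistics` dataclass are all assumptions made for illustration, and the actual implementation in this PR may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Statistics:
    """Hypothetical stand-in for metrics.Statistics(privacy_id_count, count)."""
    privacy_id_count: int
    count: int


# Assumed accumulator layout: (privacy_id_count, count) for one partition.
Accumulator = Tuple[int, int]


class StatisticsCombinerSketch:
    """Illustrative per-partition combiner (not the PipelineDP class)."""

    def create_accumulator(
            self, contributions_per_privacy_id: Iterable[int]) -> Accumulator:
        # Each element is the number of rows one privacy id contributed
        # to this partition.
        counts = list(contributions_per_privacy_id)
        return len(counts), sum(counts)

    def merge_accumulators(self, acc1: Accumulator,
                           acc2: Accumulator) -> Accumulator:
        # Both statistics are additive, so merging is element-wise addition.
        return acc1[0] + acc2[0], acc1[1] + acc2[1]

    def compute_metrics(self, acc: Accumulator) -> Statistics:
        return Statistics(privacy_id_count=acc[0], count=acc[1])


combiner = StatisticsCombinerSketch()
# Two batches for the same partition: privacy ids with 3 and 1 rows, then one with 2 rows.
acc = combiner.merge_accumulators(combiner.create_accumulator([3, 1]),
                                  combiner.create_accumulator([2]))
stats = combiner.compute_metrics(acc)
```

Because both statistics are plain sums, the accumulators can be merged in any order, which is what makes this kind of combiner suitable for distributed backends.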