PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
Before this CL, the output of UtilityAnalysis was a list of metrics in the following format:
Private partitions: [PartitionSelectionMetrics, AggregateErrorMetrics, PartitionSelectionMetrics ...]
where each consecutive pair of PartitionSelectionMetrics and AggregateErrorMetrics corresponds to one Utility Analysis configuration (UtilityAnalysis can run for multiple parameters simultaneously, e.g. for different max_partition_contributed).
Public partitions: [AggregateErrorMetrics, AggregateErrorMetrics ...]
Notes:
PartitionSelectionMetrics contains partition selection metrics (e.g. the expected number of partitions).
AggregateErrorMetrics contains per-partition error metrics (e.g. average error per partition).
Having the output in different formats for private and public partitions is inconvenient, because calling code has to regroup the list itself. This PR introduces the class AggregateMetrics, which contains both PartitionSelectionMetrics and AggregateErrorMetrics.
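For illustration, the regrouping that calling code previously had to do can be sketched as follows. This is a minimal sketch, not the code in this PR: the three dataclasses are simplified stand-ins, their field names are hypothetical, the helper group_metrics is invented for the example, and setting the partition selection part to None for public partitions is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class PartitionSelectionMetrics:
    # Stand-in for the real class; the field name is hypothetical.
    num_partitions: float


@dataclass
class AggregateErrorMetrics:
    # Stand-in for the real class; the field name is hypothetical.
    error_mean: float


@dataclass
class AggregateMetrics:
    # Assumption: None when partitions are public (no partition selection).
    partition_selection_metrics: Optional[PartitionSelectionMetrics]
    aggregate_error_metrics: AggregateErrorMetrics


def group_metrics(
    flat: List[Union[PartitionSelectionMetrics, AggregateErrorMetrics]],
    public_partitions: bool,
) -> List[AggregateMetrics]:
    """Converts the old flat output into one object per configuration."""
    if public_partitions:
        # Old public format: [AggregateErrorMetrics, AggregateErrorMetrics, ...]
        return [AggregateMetrics(None, error) for error in flat]
    # Old private format: [PartitionSelectionMetrics, AggregateErrorMetrics, ...]
    if len(flat) % 2 != 0:
        raise ValueError("Expected (selection, error) pairs in the flat list.")
    return [
        AggregateMetrics(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)
    ]


# Two configurations with private partitions.
private_flat = [
    PartitionSelectionMetrics(10.0), AggregateErrorMetrics(0.5),
    PartitionSelectionMetrics(12.0), AggregateErrorMetrics(0.4),
]
private_grouped = group_metrics(private_flat, public_partitions=False)

# Two configurations with public partitions.
public_flat = [AggregateErrorMetrics(0.3), AggregateErrorMetrics(0.2)]
public_grouped = group_metrics(public_flat, public_partitions=True)
```

With a single container type, both cases yield one element per configuration, so calling code no longer needs to branch on how the list is laid out.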