OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0
270 stars 75 forks

Small updates of typing and documentation for CustomCombiner #450

Closed dvadym closed 1 year ago

dvadym commented 1 year ago

CustomCombiner is the way a client can implement a custom DP aggregation, while PipelineDP manages all the infrastructure (contribution bounding, partition selection, budget management, etc.).

The initial version of this API assumed that a CustomCombiner requests a budget only once. But a CustomCombiner might need to request multiple budget objects: for example, computing a mean requires adding noise twice, once for the count and once for the sum. This PR removes that assumption. A CustomCombiner can now use any structure for keeping its budgets (e.g. a dictionary) rather than a single budget object.
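To illustrate the mean example, here is a minimal, self-contained sketch of a combiner that keeps two budgets in a dictionary. It does not use the PipelineDP API: `ToyBudget`, `ToyBudgetAccountant`, `ToyMeanCombiner`, and `laplace_noise` are hypothetical names made up for this example. The even epsilon split, the Laplace mechanism, and the assumption that each privacy unit contributes at most one value in `[0, max_value]` are also assumptions of the sketch, not something this PR prescribes.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Accumulator = Tuple[int, float]  # (count, clipped sum)


@dataclass
class ToyBudget:
    """Hypothetical budget object; eps is filled in once all requests are known."""
    eps: float = 0.0


class ToyBudgetAccountant:
    """Hypothetical accountant that splits total_eps evenly across all requests."""

    def __init__(self, total_eps: float) -> None:
        self._total_eps = total_eps
        self._budgets: List[ToyBudget] = []

    def request_budget(self) -> ToyBudget:
        budget = ToyBudget()
        self._budgets.append(budget)
        return budget

    def compute_budgets(self) -> None:
        for budget in self._budgets:
            budget.eps = self._total_eps / len(self._budgets)


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class ToyMeanCombiner:
    """Illustrative combiner that keeps its budgets in a dictionary."""

    def __init__(self, max_value: float, seed: int = 0) -> None:
        self._max_value = max_value
        self._rng = random.Random(seed)
        self._budgets: Dict[str, ToyBudget] = {}

    def request_budget(self, accountant: ToyBudgetAccountant) -> None:
        # Noise is added twice (count and sum), so two budgets are requested.
        self._budgets = {
            "count": accountant.request_budget(),
            "sum": accountant.request_budget(),
        }

    def create_accumulator(self, values: Sequence[float]) -> Accumulator:
        # Clip each value to [0, max_value] so the sum has bounded sensitivity.
        clipped = [min(max(v, 0.0), self._max_value) for v in values]
        return len(clipped), float(sum(clipped))

    def merge_accumulators(self, a: Accumulator, b: Accumulator) -> Accumulator:
        return a[0] + b[0], a[1] + b[1]

    def compute_metrics(self, acc: Accumulator) -> float:
        count, total = acc
        # Assumed sensitivities: 1 for the count, max_value for the sum.
        noisy_count = count + laplace_noise(
            1.0 / self._budgets["count"].eps, self._rng)
        noisy_sum = total + laplace_noise(
            self._max_value / self._budgets["sum"].eps, self._rng)
        # Guard against a tiny or negative noisy count (post-processing).
        return noisy_sum / max(noisy_count, 1.0)


accountant = ToyBudgetAccountant(total_eps=1.0)
combiner = ToyMeanCombiner(max_value=5.0)
combiner.request_budget(accountant)
accountant.compute_budgets()  # each of the two budgets gets eps = 0.5
acc = combiner.merge_accumulators(
    combiner.create_accumulator([1.0, 2.0]),
    combiner.create_accumulator([9.0]),  # 9.0 is clipped to 5.0
)
dp_mean = combiner.compute_metrics(acc)
```

The dictionary lets the combiner look up each budget by the quantity it protects, which would be awkward with a single budget object.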

dvadym commented 1 year ago

Thanks for the review!