OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0
270 stars 75 forks

Implement mean in DataFrame API #505

Closed dvadym closed 7 months ago

dvadym commented 8 months ago

This PR introduces QueryBuilder.mean to allow DP mean computation. Along the way, a small refactoring was done: almost all validations are now performed in QueryBuilder.build_query instead of in the individual aggregation functions (count, sum, etc.). This is more convenient because the conditions for a correct state can be complicated, especially since there will be more aggregations in the future and it will be allowed to compute aggregates of multiple columns.
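To illustrate the refactoring idea, here is a minimal sketch of a builder that records aggregations and defers validation to a single build step. It is not PipelineDP's code: `SketchQueryBuilder`, `Query`, the `groupby` method, and the specific checks (including the single-value-column restriction) are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical immutable result of the builder."""
    group_by: str
    aggregations: Tuple[Tuple[str, Optional[str]], ...]


class SketchQueryBuilder:
    """Hypothetical builder, not PipelineDP's actual QueryBuilder."""

    def __init__(self) -> None:
        self._group_by: Optional[str] = None
        self._aggregations: List[Tuple[str, Optional[str]]] = []

    def groupby(self, column: str) -> "SketchQueryBuilder":
        self._group_by = column
        return self

    # Aggregation methods only record the request; they do not validate.
    def count(self) -> "SketchQueryBuilder":
        self._aggregations.append(("count", None))
        return self

    def sum(self, column: str) -> "SketchQueryBuilder":
        self._aggregations.append(("sum", column))
        return self

    def mean(self, column: str) -> "SketchQueryBuilder":
        self._aggregations.append(("mean", column))
        return self

    def build_query(self) -> Query:
        # All state checks live here, so adding a new aggregation
        # does not require duplicating them.
        if self._group_by is None:
            raise ValueError("groupby must be called before build_query")
        if not self._aggregations:
            raise ValueError("at least one aggregation is required")
        value_columns = {col for _, col in self._aggregations if col is not None}
        # Assumption for this sketch: only one value column is allowed.
        if len(value_columns) > 1:
            raise ValueError("aggregating multiple columns is not supported")
        return Query(self._group_by, tuple(self._aggregations))


query = SketchQueryBuilder().groupby("day").count().mean("spent").build_query()
```

With this layout, relaxing a rule later (for example, allowing several value columns) would mean changing one check in `build_query` rather than touching every aggregation method.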