PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
This PR introduces QueryBuilder.mean to allow DP mean computation. Along the way, a small refactoring was done: almost all validations are now performed in QueryBuilder.build_query instead of in the individual aggregation functions (count, sum, etc.). That is more convenient, since the conditions for a valid builder state can get complicated, especially as more aggregations are added in the future and computing aggregates over multiple columns becomes possible.
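To illustrate the idea behind the refactoring, here is a minimal, self-contained sketch of a builder whose aggregation methods only record requests, while a single build_query step validates the whole state. Everything in it (ToyQueryBuilder, AggregationSpec, the parameter names, and the specific checks) is hypothetical and chosen for illustration; it is not the PipelineDP implementation and does not compute any DP result.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AggregationSpec:
    """One requested aggregation, recorded as-is without validation."""
    name: str                         # "count", "sum" or "mean"
    column: Optional[str] = None      # value column; unused for count
    min_value: Optional[float] = None
    max_value: Optional[float] = None


class ToyQueryBuilder:
    """Hypothetical stand-in for a query builder (not the PipelineDP class)."""

    def __init__(self, columns: List[str]):
        self._columns = list(columns)
        self._by: Optional[str] = None
        self._specs: List[AggregationSpec] = []

    def groupby(self, column: str) -> "ToyQueryBuilder":
        self._by = column
        return self

    # The aggregation methods only record the request; they do not validate.
    def count(self) -> "ToyQueryBuilder":
        self._specs.append(AggregationSpec("count"))
        return self

    def sum(self, column: str, min_value: float, max_value: float) -> "ToyQueryBuilder":
        self._specs.append(AggregationSpec("sum", column, min_value, max_value))
        return self

    def mean(self, column: str, min_value: float, max_value: float) -> "ToyQueryBuilder":
        self._specs.append(AggregationSpec("mean", column, min_value, max_value))
        return self

    def build_query(self) -> List[AggregationSpec]:
        """Validates the whole builder state in one place."""
        if self._by is None:
            raise ValueError("groupby must be called before build_query")
        if self._by not in self._columns:
            raise ValueError(f"unknown groupby column: {self._by}")
        if not self._specs:
            raise ValueError("no aggregations were requested")
        for spec in self._specs:
            if spec.name == "count":
                continue  # count needs no value column or bounds
            if spec.column not in self._columns:
                raise ValueError(f"unknown value column: {spec.column}")
            if spec.min_value > spec.max_value:
                raise ValueError(f"min_value > max_value for {spec.name}")
        return list(self._specs)


if __name__ == "__main__":
    builder = ToyQueryBuilder(["user_id", "city", "spend"])
    specs = builder.groupby("city").count().mean("spend", 0.0, 100.0).build_query()
    print([s.name for s in specs])
```

With this layout, a new aggregation only needs a recording method, and any rule that spans several aggregations or columns has a single place to live.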