OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0

Precomputation of histogram tails #431

Open RamSaw opened 1 year ago

RamSaw commented 1 year ago

Description

A histogram is a sequence of bins. It is often useful to have aggregated statistics for a given bin.lower (aka bin id). This change introduces aggregations over the tail of the histogram: for example, given bin.lower = k, the total sum of counts over all bins with bin.lower >= k.
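To illustrate the idea, here is a minimal sketch of a tail precomputation. It is not the PipelineDP implementation: the `Bin` class and the `tail_counts` function are hypothetical names made up for this example, and it assumes each bin carries only an integer `lower` and a `count`. All tails are computed in one pass by accumulating counts from the highest bin down.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import Dict, List


@dataclass(frozen=True)
class Bin:
    # Hypothetical stand-in for a histogram bin; not the PipelineDP class.
    lower: int  # lower bound of the bin, used as the bin id
    count: int  # number of elements that fall into the bin


def tail_counts(bins: List[Bin]) -> Dict[int, int]:
    """Maps each bin.lower = k to the sum of counts of bins with lower >= k."""
    # Visit bins from the highest lower bound to the lowest, so the running
    # sum at each step covers exactly the bins with lower >= the current one.
    ordered = sorted(bins, key=lambda b: b.lower, reverse=True)
    running = accumulate(b.count for b in ordered)
    return {b.lower: total for b, total in zip(ordered, running)}


bins = [Bin(lower=1, count=5), Bin(lower=2, count=3), Bin(lower=5, count=2)]
tails = tail_counts(bins)
print(tails)
```

After this precomputation, a tail query for any existing bin.lower is a single dictionary lookup instead of a scan over all bins.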

Checklist