Open RaphelWei opened 2 weeks ago
This paper formally defines $\rho$-reproducibility.
It mainly discusses several aspects:
These definitions revolve around the "Heavy Hitter" problem, which deals with identifying elements in a distribution that appear frequently (above a certain threshold).
Concept: This defines the problem of identifying approximately the heavy hitters of a distribution $$D$$, given a set of samples from $$D$$.
Formal Description: Given sample access to the distribution $$D$$, the goal is to find a set $$L$$ that approximates the set of heavy hitters ($$L_v$$) within some error margin $$\epsilon$$.
Explanation: In practice, you don't have access to the entire distribution $$D$$, only samples from it. The problem is to approximate the set of $$v$$-heavy-hitters. Due to the approximation, there's a margin of error $$\epsilon$$. You want the output set $$L$$ to include all elements that are heavy hitters with some leeway $$v + \epsilon$$, but you also want to exclude elements that aren't heavy hitters up to $$v - \epsilon$$. This ensures that the approximation is both conservative and inclusive within that error range.
paper