Closed daniel-vos closed 1 year ago
Hi Daniel, thank you for reporting this bug, we will work to fix this for the next release.
As regards performance, you're right, this is a naive implementation. We are currently finalising a paper that looks at an alternative method and will update you here when we can share it.
When giving a list / array for
quant
in thediffprivlib.tools.quantile
function,random_state
that is given is not passed along to the calls that the function makes under the hood which means results vary between calls with the same random seed.Reproduce
Currently this assertion will fail but of course should be the same between calls. I'm on version
diffprivlib==0.6.2
.Fix Although it's easy to fix this (pass
random_state=random_state
along with the call at line 117 ofquantile.py
) the current method uses more epsilon budget than is required when computing multiple quantiles. E.g. the algorithm in 'Differentially Private Quantiles' by Jennifer Gillenwater, Matthew Joseph, Alex Kulesza appears to score significantly better than independent exponential mechanism calls. Maybe this is something to consider?