opendp / smartnoise-sdk

Tools and service for differentially private processing of tabular and relational data
MIT License

Dead quantile code in SmartNoise-SQL is vulnerable to floating-point attacks #572

Closed TedTed closed 1 year ago

TedTed commented 1 year ago

Hi folks,

This code contains a naively implemented quantile function that is vulnerable to fairly trivial floating-point attacks. For example, the following code:

import snsql
from snsql.sql._mechanisms.approx_bounds import quantile
print(quantile([0]*1491+[1], 0.5, 1, 0, 1))

will always return a value, but the following code:

import snsql
from snsql.sql._mechanisms.approx_bounds import quantile
print(quantile([0]*1492+[1], 0.5, 1, 0, 1))

will always crash with "ValueError: probabilities contain NaN".
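One plausible explanation for the sharp threshold between 1491 and 1492 zeros is underflow in exponential-mechanism weights. The sketch below is not the snsql implementation: `naive_quantile_weights` and `normalize` are hypothetical helpers, and the argument names (`alpha`, `epsilon`, `lower`, `upper`) are assumed from the order of the arguments in the calls above. It only shows that weighting each gap between sorted values by `exp(-epsilon * distance_from_target_rank)` leaves nothing to normalize once the only non-empty gap sits too far from the target rank.

```python
import math

def naive_quantile_weights(sorted_vals, alpha, epsilon, lower, upper):
    """Hypothetical exponential-mechanism weights (NOT the snsql code).

    Each gap between consecutive sorted values gets weight
    width * exp(-epsilon * |rank - alpha * n|).
    """
    n = len(sorted_vals)
    padded = [lower] + list(sorted_vals) + [upper]
    weights = []
    for k in range(n + 1):
        width = padded[k + 1] - padded[k]
        utility = -abs(k - alpha * n)
        # math.exp silently underflows to 0.0 for very negative arguments
        weights.append(width * math.exp(epsilon * utility))
    return weights

def normalize(weights):
    """Turn weights into probabilities; NaN everywhere if they sum to zero."""
    total = sum(weights)
    if total == 0.0:
        # NumPy would compute 0.0 / 0.0 = NaN here; pure Python would raise
        return [float("nan")] * len(weights)
    return [w / total for w in weights]

ok = normalize(naive_quantile_weights([0] * 1491 + [1], 0.5, 1, 0, 1))
bad = normalize(naive_quantile_weights([0] * 1492 + [1], 0.5, 1, 0, 1))
print(sum(ok), any(math.isnan(p) for p in bad))
```

Under these assumptions, the single non-empty gap has utility -745 in the first case and -745.5 in the second. In IEEE 754 double precision, `math.exp(-745)` is still the smallest subnormal, while `math.exp(-745.5)` rounds to 0.0, so whether the function returns or fails depends on the presence of a single record.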

I haven't tried it, but I'm also fairly certain that it's vulnerable to precision-based attacks, for the same reason as diffprivlib and SmartNoise Core: the use of np.random.uniform is very dangerous in that context.
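To illustrate the general shape of a precision-based attack (not a tested exploit of this code), here is a minimal sketch. It assumes a uniform draw is computed as `lo + (hi - lo) * u`, with `u` a multiple of 2**-53, which is how CPython's `random.random()` produces values; `sample_interval` and `on_unit_grid` are hypothetical helpers, and the sketch uses the standard library rather than NumPy.

```python
import random

GRID = 2.0 ** 53  # CPython's random.random() returns multiples of 2**-53

def on_unit_grid(x):
    """True if x is an integer multiple of 2**-53."""
    return (x * GRID).is_integer()

def sample_interval(lo, hi, u):
    """Naive uniform draw from [lo, hi) given u in [0, 1)."""
    return lo + (hi - lo) * u

u = 3 / GRID  # one value random.random() can return
wide = sample_interval(0.0, 1.0, u)           # interval [0, 1)
narrow = sample_interval(0.0, 2.0 ** -10, u)  # interval [0, 2**-10)

print(on_unit_grid(wide), on_unit_grid(narrow))
```

The low-order bits of the output differ depending on which interval was sampled, so an observer can sometimes rule out candidate intervals from the output alone. When the interval endpoints are data values, that leaks information the privacy analysis does not account for.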

Thankfully, this code doesn't seem to be used anywhere (that I can see). Nonetheless, it seems worth removing from the repository.

joshua-oss commented 1 year ago

Thanks for reporting. This is, indeed, dead and buggy code, and we will remove it. The intent was to use this to compute quantiles when we have a reliable cross-engine way to sample row values (rather than just return aggregates). When we reach that point, we'll use an algorithm that is safe, and ensure that our implementation doesn't leak values.

joshua-oss commented 1 year ago

Fixed in 1.0.1