opendp / smartnoise-sdk

Tools and service for differentially private processing of tabular and relational data
MIT License

Add support for upper and lower bounds on datetimes #506

Open FishmanL opened 2 years ago

FishmanL commented 2 years ago

Specifically, so I can run queries like SELECT AVG(end_date - start_date) FROM table.

joshua-oss commented 1 year ago

This is possible. For scenarios like the above, though, adding a precomputed column such as "duration" often gives better utility, because you can bound the sensitivity more tightly. For example, the view exposed to SmartNoise could include a column computed as end_date - start_date, and the data curator could set reasonably tight bounds on it, based on domain expertise or on a previous analysis of similar data (e.g., fences at 1.5x IQR to remove outliers).
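A rough sketch of the curator-side idea, in plain Python rather than any SmartNoise API: derive 1.5x IQR fences from a reference sample, then compute a clamped duration column. The helper names (`duration_days`, `iqr_bounds`, `clamp`), the sample values, and the choice of whole days as the unit are all assumptions made for illustration. The reference sample is assumed to come from a prior analysis of similar, non-private data, since deriving bounds from the private data itself would leak information.

```python
from datetime import date
from statistics import quantiles


def duration_days(start: date, end: date) -> int:
    """Value of the derived column: end_date - start_date, in whole days."""
    return (end - start).days


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clamp(value, lower, upper):
    """Clip value into the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


# Hypothetical durations (days) from a prior analysis of similar data.
reference_durations = [3, 5, 4, 6, 5, 7, 4, 5]
lower, upper = iqr_bounds(reference_durations)
lower = max(lower, 0.0)  # assumption: a duration cannot be negative

# Hypothetical private rows: (start_date, end_date).
rows = [
    (date(2021, 1, 1), date(2021, 1, 5)),
    (date(2021, 2, 1), date(2021, 2, 7)),
    (date(2021, 3, 1), date(2022, 3, 1)),  # outlier, clipped to the upper fence
]
durations = [clamp(duration_days(s, e), lower, upper) for s, e in rows]
```

The resulting `lower` and `upper` are the kind of values a curator might record as the bounds for the duration column. Because the fences are much narrower than the full range of possible date differences, the sensitivity of an average over that column is correspondingly smaller.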

FishmanL commented 1 year ago

Yep, was planning to first-pass it with duration -- just trying to allow for more complicated things with datetimes than I might think of in advance (overall product is a wrapper around arbitrary queries)