Closed: sasvaritoni closed this 3 years ago
Sorry for the delayed response. We've made adjustments to improve notifications for questions like this one.
There are utilities for generating synthetic data in the SmartNoise SDK. Synthetic data generation has not yet been integrated into the core library. You can find documentation for using the synthetic data features in the SmartNoise SDK here.
There are three possible ways to generate publicly releasable datasets that are differentially private.
Approach 2 is very doable in the present SmartNoise Core release, and in the forthcoming OpenDP library.
Approach 3 (LDP) is not presently in scope for SmartNoise/OpenDP, but could be if someone wants to run with it. The mechanisms are already in the library; we just haven't thought seriously about all the required utilities.
Approach 1 is possible, but the models presently supported (VCV matrices) are not high in utility. There is some additional DP-GAN code available under the SmartNoise umbrella, but it is not currently integrated with the SmartNoise Core library. It exists as a separate service/process. This is likely in scope, but we don't yet have a timeline for it.
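To make the LDP idea concrete, here is a minimal sketch of a local release using classic randomized response on a single binary column. It is plain Python and does not use any SmartNoise or OpenDP API; the function names (`randomized_response`, `privatize_column`, `estimate_mean`) and the binary-column setup are assumptions made purely for illustration.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it.

    The ratio of the two reporting probabilities is e^eps, which is what
    makes each individual report epsilon-locally differentially private.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def privatize_column(bits, epsilon, seed=None):
    """Apply randomized response independently to every record (hypothetical helper)."""
    rng = random.Random(seed)
    return [randomized_response(b, epsilon, rng) for b in bits]


def estimate_mean(noisy_bits, epsilon):
    """Debias the mean of the released bits to estimate the true proportion of 1s."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(noisy_bits) / len(noisy_bits)
    # E[observed] = (2p - 1) * true_mean + (1 - p), so invert that relation.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    original = [1] * 6000 + [0] * 14000   # true proportion of 1s is 0.3
    released = privatize_column(original, epsilon=1.0, seed=0)
    print(estimate_mean(released, epsilon=1.0))
```

The released list has the same shape as the original column, so it can be published record by record, and analysts can still recover aggregate statistics after debiasing. A real dataset would also need handling for categorical and numeric columns and a privacy budget split across them, which is the kind of utility work mentioned above.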
Hi,
As far as I understand, OpenDP is mainly meant for various statistical queries. I am wondering whether OpenDP could be used to generate a differentially private release of a dataset, that is, to transform the original dataset into an "anonymized" one. Is this perhaps planned for the future?
Thanks, Toni