SmartNoise SDK: Tools for Differential Privacy on Tabular Data

The SmartNoise SDK includes two packages:

smartnoise-sql: Run differentially private SQL queries
smartnoise-synth: Generate differentially private synthetic data

To get started, see the examples below. Each project's own documentation has more detailed examples.

SQL

Install

pip install smartnoise-sql

Query

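import snsql
from snsql import Privacy
import pandas as pd

csv_path = 'PUMS.csv'
meta_path = 'PUMS.yaml'

# Load the data and set the privacy parameters applied to queries
data = pd.read_csv(csv_path)
privacy = Privacy(epsilon=1.0, delta=0.01)

# Wrap the DataFrame in a private reader described by the metadata file
reader = snsql.from_connection(data, privacy=privacy, metadata=meta_path)

# Run a differentially private aggregate query
result = reader.execute('SELECT sex, AVG(age) AS age FROM PUMS.PUMS GROUP BY sex')

print(result)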

PUMS.csv and PUMS.yaml can be found in the datasets folder.
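
For intuition about what `epsilon` controls in a query like the one above, here is a minimal sketch of the Laplace mechanism applied to a row count. It is an illustration only and not how smartnoise-sql is implemented: `laplace_noise` and `dp_count` are hypothetical helper names, the sensitivity of 1 assumes each person contributes exactly one row, and sampling noise with plain floating-point arithmetic like this is not safe for production use.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(rows, epsilon, rng):
    """Return a noisy row count (hypothetical helper, for illustration).

    Assumes adding or removing one person changes the count by at most 1,
    so the noise scale is sensitivity / epsilon = 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return len(rows) + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only to make the demo repeatable
rows = list(range(1000))
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: noisy count = {dp_count(rows, eps, rng):.2f}")
```

Smaller `epsilon` values give stronger privacy and noisier answers; larger values give the reverse.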

See the SQL project

Synthesizers

Install

pip install smartnoise-synth

MWEM

import pandas as pd
import numpy as np
import snsynth

pums_csv_path = 'PUMS.csv' # adjust to where PUMS.csv from datasets/ is stored

pums = pd.read_csv(pums_csv_path, index_col=None)
pums = pums.drop(['income'], axis=1)
nf = pums.to_numpy().astype(int) # integer-coded array of the remaining columns

synth = snsynth.MWEMSynthesizer(epsilon=1.0, split_factor=nf.shape[1])
synth.fit(nf)

sample = synth.sample(10) # get 10 synthetic rows
print(sample)

PATE-CTGAN

import pandas as pd
import numpy as np
from snsynth.pytorch.nn import PATECTGAN
from snsynth.pytorch import PytorchDPSynthesizer

pums_csv_path = 'PUMS.csv' # adjust to where PUMS.csv from datasets/ is stored

pums = pd.read_csv(pums_csv_path, index_col=None)
pums = pums.drop(['income'], axis=1)

synth = PytorchDPSynthesizer(1.0, PATECTGAN(regularization='dragan'), None)
synth.fit(pums, categorical_columns=pums.columns.values.tolist())

sample = synth.sample(10) # synthesize 10 rows
print(sample)
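
One rough way to sanity-check synthetic rows such as the samples above is to compare the value frequencies of a single column in the original and the synthetic data. The sketch below computes the total variation distance between two lists of categorical values using only the standard library. `total_variation_distance` and the toy lists are hypothetical and not part of smartnoise-synth. Because the statistic is computed against the real data, it is not itself differentially private and should be treated as an internal diagnostic.

```python
from collections import Counter


def total_variation_distance(real_values, synthetic_values):
    """Total variation distance between two empirical distributions.

    Returns a value in [0, 1]: 0 means identical frequencies, 1 means the
    two samples share no values at all.
    """
    if not real_values or not synthetic_values:
        raise ValueError("both samples must be non-empty")
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    n_real, n_synth = len(real_values), len(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    return 0.5 * sum(
        abs(real_freq[c] / n_real - synth_freq[c] / n_synth) for c in categories
    )


# Toy data standing in for one column (e.g. `sex`) of real and synthetic rows.
real_sex = [0, 0, 1, 1, 1, 0, 1, 0]
synthetic_sex = [0, 1, 1, 1, 1, 0, 1, 1]
print(total_variation_distance(real_sex, synthetic_sex))
```

A value near 0 suggests the synthetic column tracks the original frequencies closely; it says nothing about relationships between columns.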

See the Synthesizers project

Communication

You are encouraged to join us on GitHub Discussions.
Please use GitHub Issues for bug reports and feature requests.
For other requests, including security issues, please contact us at smartnoise@opendp.org.

Releases and Contributing

If you encounter a bug, please let us know by creating an issue.

We appreciate all contributions. Please review the contributors guide. We welcome pull requests with bug fixes without prior discussion.

If you plan to contribute new features, utility functions, or extensions to this system, please first open an issue and discuss the feature with us.
