opendp / smartnoise-core

Differential privacy validator and runtime
MIT License

Notice: SmartNoise-Core is deprecated. Please migrate to the OpenDP library.




SmartNoise Core Differential Privacy Library

Please see the accompanying SmartNoise Documentation, SmartNoise SDK repository, and SmartNoise Samples repository for this system.


Differential privacy is the gold standard definition of privacy protection. The SmartNoise project, in collaboration with OpenDP, aims to connect theoretical solutions from the academic community with the practical lessons learned from real-world deployments, to make differential privacy broadly accessible to future deployments. Specifically, we provide several basic building blocks that can be used by people working with sensitive data, with implementations based on vetted and mature differential privacy research. Here in the Core, we provide a pluggable open source library of differentially private algorithms and mechanisms for releasing privacy-preserving queries and statistics, as well as APIs for defining an analysis and a validator for evaluating these analyses and composing the total privacy loss on a dataset.

The mechanisms library provides a fast, memory-safe native runtime for validating and running differentially private analyses. The runtime and validator are built in Rust, while Python support is available and R support is forthcoming.

Differentially private computations are specified as an analysis graph that can be validated and executed to produce differentially private releases of data. Releases include metadata about accuracy of outputs and the complete privacy cost of the analysis.
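As a rough illustration of what an analysis graph with a tracked privacy cost can look like, here is a minimal Python sketch. It is not the SmartNoise API or its protobuf schema: the `Component` class, the component names, and the helper functions are all hypothetical, and the total cost assumes basic sequential composition (the epsilons simply add up).

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    """One node of a hypothetical analysis graph (illustrative, not the real schema)."""
    name: str                                      # e.g. "Clamp", "Sum", "LaplaceMechanism"
    arguments: dict = field(default_factory=dict)  # argument name -> id of the upstream node
    epsilon: float = 0.0                           # budget spent here; 0 for non-mechanisms

# Hypothetical analysis: a privatized clamped sum and a privatized count.
analysis = {
    0: Component("Materialize"),
    1: Component("Clamp", {"data": 0}),
    2: Component("Sum", {"data": 1}),
    3: Component("LaplaceMechanism", {"data": 2}, epsilon=0.5),
    4: Component("Count", {"data": 0}),
    5: Component("GeometricMechanism", {"data": 4}, epsilon=0.25),
}

def references_are_valid(graph):
    """Return True if every argument points at a node that exists in the graph."""
    return all(src in graph
               for node in graph.values()
               for src in node.arguments.values())

def total_epsilon(graph):
    """Assumed basic sequential composition: mechanism epsilons are summed."""
    return sum(node.epsilon for node in graph.values())

print(references_are_valid(analysis), total_epsilon(analysis))
```

Keeping the graph as plain data is what lets it be checked before any data is touched, and serialized for another process to execute.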


More about SmartNoise Core

Components

The primary releases available in the library, and the mechanisms for generating these releases, are enumerated below. For a full listing of the extensive set of components available in the library see this documentation.

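| Statistics          | Mechanisms | Utilities  |
|---------------------|------------|------------|
| Count               | Gaussian   | Cast       |
| Histogram           | Geometric  | Clamping   |
| Mean                | Laplace    | Digitize   |
| Quantiles           |            | Filter     |
| Sum                 |            | Imputation |
| Variance/Covariance |            | Transform  |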
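To show how one entry from each column fits together, the following Python sketch chains a utility (clamping), a statistic (sum) and a mechanism (Laplace). The function names are hypothetical and this is not the library's implementation. In particular, the sampler uses ordinary floating-point arithmetic and is not hardened the way a production mechanism needs to be. The sensitivity assumes neighboring datasets differ by adding or removing one record.

```python
import random

def clamp(values, lower, upper):
    """Utility: force every value into [lower, upper] so the sum has bounded sensitivity."""
    return [min(max(v, lower), upper) for v in values]

def laplace_mechanism(value, sensitivity, epsilon, rng=random):
    """Mechanism: add Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise

def dp_sum(values, lower, upper, epsilon, rng=random):
    """Statistic: clamped sum released through the Laplace mechanism."""
    clamped = clamp(values, lower, upper)
    # Adding or removing one clamped record changes the sum by at most this much.
    sensitivity = max(abs(lower), abs(upper))
    return laplace_mechanism(sum(clamped), sensitivity, epsilon, rng)

ages = [23, 35, 51, 64, 130]  # 130 falls outside the bounds and is clamped to 100
release = dp_sum(ages, lower=0, upper=100, epsilon=1.0)
print(release)  # a noisy value near 273; it differs on every run
```

Clamping comes first because, without known bounds, a single record could change the sum by an arbitrary amount and no finite noise scale would suffice.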

Architecture

There are three sub-projects that address individual architectural concerns. These sub-projects communicate via protobuf messages that encode a graph description of an arbitrary computation, called an analysis.

1. Validator

The core library is the validator, which provides a suite of utilities for checking and deriving sufficient conditions for an analysis to be differentially private. This includes checking whether specific properties have been met for each component, deriving sensitivities, noise scales and accuracies for various definitions of privacy, building reports, and dynamically validating individual components. This library is written in Rust.
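The kind of derivation described above can be sketched for one simple case, a clamped sum released with Laplace noise under pure epsilon-differential privacy. This is a hedged Python illustration with hypothetical function names, not the validator's actual logic. It assumes add/remove-one neighboring datasets and uses the Laplace tail bound P(|noise| > a) = exp(-a / scale) to turn a noise scale into an accuracy.

```python
import math

def sum_sensitivity(lower, upper):
    """Largest change to a clamped sum when one record is added or removed."""
    return max(abs(lower), abs(upper))

def laplace_scale(sensitivity, epsilon):
    """Noise scale that makes a Laplace release epsilon-differentially private."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon

def laplace_accuracy(scale, alpha):
    """Half-width a such that |noise| <= a with probability 1 - alpha."""
    return scale * math.log(1.0 / alpha)

def validate_clamped_sum(properties, epsilon, alpha=0.05):
    """Check the property a sum needs (known bounds), then derive a small report."""
    lower, upper = properties.get("lower"), properties.get("upper")
    if lower is None or upper is None:
        raise ValueError("bounds are unknown: clamp the data before summing")
    sensitivity = sum_sensitivity(lower, upper)
    scale = laplace_scale(sensitivity, epsilon)
    return {"epsilon": epsilon, "sensitivity": sensitivity,
            "scale": scale, "accuracy": laplace_accuracy(scale, alpha)}

report = validate_clamped_sum({"lower": 0, "upper": 100}, epsilon=0.5)
print(report)
```

Nothing here reads the data itself. The check works only on declared properties such as bounds, which is why an analysis can be rejected before it is ever run.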

2. Runtime

There must also be a medium to execute the analysis, called a runtime. There is a reference runtime written in Rust, but runtimes may be written using any computation framework (SQL, Spark or Dask, for example) to address your individual data needs.
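A toy runtime can make the idea concrete. The Python sketch below evaluates a hypothetical graph by dispatching each component name to a function. The graph layout, component names and `execute` function are assumptions made for illustration and do not reflect the reference runtime. Only deterministic components are included, so nothing here is differentially private on its own; mechanisms would be further entries in the table.

```python
# Hypothetical analysis graph: node id -> (component name, argument node ids, options)
analysis = {
    "data":  ("Literal", [],        {"value": [3.0, 12.0, -4.0, 7.0]}),
    "clamp": ("Clamp",   ["data"],  {"lower": 0.0, "upper": 10.0}),
    "sum":   ("Sum",     ["clamp"], {}),
    "count": ("Count",   ["data"],  {}),
}

# One possible backend: plain Python lists.
OPERATIONS = {
    "Literal": lambda args, opts: opts["value"],
    "Clamp":   lambda args, opts: [min(max(x, opts["lower"]), opts["upper"]) for x in args[0]],
    "Sum":     lambda args, opts: sum(args[0]),
    "Count":   lambda args, opts: len(args[0]),
}

def execute(graph, node_id, operations=OPERATIONS, cache=None):
    """Evaluate node_id after its arguments, memoizing each intermediate result."""
    cache = {} if cache is None else cache
    if node_id not in cache:
        name, argument_ids, options = graph[node_id]
        arguments = [execute(graph, a, operations, cache) for a in argument_ids]
        cache[node_id] = operations[name](arguments, options)
    return cache[node_id]

print(execute(analysis, "sum"), execute(analysis, "count"))
```

Because the graph only names components, a runtime for another framework would replace the `OPERATIONS` table and leave the analysis unchanged.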

3. Bindings

Finally, there are helper libraries for building analyses, called bindings. Bindings may be written for any language, and are thin wrappers over the validator and/or runtime(s). Language bindings are currently available for Python, with support for at least R, Rust and SQL forthcoming.

Note on Protocol Buffers

Communication among projects is handled via Protocol Buffer definitions in the /validator-rust/prototypes directory. All three sub-projects implement:

The projects have compiled cross-platform at some point, though more testing is needed. The validator and reference runtime compile to standalone libraries that may be linked into your project, allowing communication over C foreign function interfaces.

Installation

Refer to troubleshooting.md for install problems.

PyPI packages

Refer to core-python, which contains the Python bindings, including links to PyPI packages.

Crates.io

The crates are intended for library consumers.

The Rust Validator and Runtime are available as crates:

From Source

The source install is intended for library developers.

You may find it easier to use the library with this repository set up as a submodule of some set of language bindings. In this case, switch to the language bindings setup. You can still push commits and branches from the core submodule of whatever bindings language you prefer.

  1. Clone the repository

    git clone git@github.com:opendp/smartnoise-core.git

  2. Install system dependencies (Rust, gcc)

    Mac:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    xcode-select --install

    Linux:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    sudo apt-get install diffutils gcc make m4

    Windows: Install WSL and refer to the Linux instructions.

  3. In a new terminal:
    Build crate

    cargo build

    Test crate

    cargo test

    Document crate

    cargo rustdoc --open

    Build production docs

    ./build_docs.sh

There are crates in validator-rust and runtime-rust, and a virtual crate in the repository root that runs commands on both. Switch between crates via cd, or by passing the manifest path to cargo, for example --manifest-path=validator-rust/Cargo.toml.


Getting Started

Jupyter Notebook Examples

We have numerous Jupyter notebooks demonstrating the use of the Core library and validator through our Python bindings. These are in our accompanying samples repository, which has exemplars, notebooks and sample code demonstrating most facets of this project.

[Figures: relative error distributions, release box plots, histogram releases, utility simulations, bias simulations]

SmartNoise Core Rust Documentation

The Rust documentation covers all pieces of the library and validator, including extensive component-by-component descriptions with examples.

Communication

Releases and Contributing

Please let us know if you encounter a bug by creating an issue.

We appreciate all contributions. We welcome pull requests with bug fixes without prior discussion.

If you plan to contribute new features, utility functions or extensions to the core, please first open an issue and discuss the feature with us.

Contributing Team

Joshua Allen, Christian Covington, Ethan Cowan, Eduardo de Leon, Ira Globus-Harris, James Honaker, Jason Huang, Michael Phelan, Raman Prasad, Michael Shoemate, Saniya Vahedian Movahed, You?