OpenMined / PyDP

The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy
Apache License 2.0

Adding `dp::PartitionSelectionStrategy` wrapper for Python #374

Closed levzlotnik closed 3 years ago

levzlotnik commented 3 years ago

Description

As per the issue on PipelineDP, this PR adds support for private partition selection strategies:

  1. Truncated geometric thresholding (paper)
  2. Laplacian/Gaussian thresholding (ref)

The API is described in the issue on PipelineDP.
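
For readers unfamiliar with thresholding, here is a minimal sketch of the Laplace variant in plain Python. It is not the wrapped C++ implementation. It assumes each user contributes to a single partition (sensitivity 1), uses the textbook threshold T = 1 + ln(1/(2δ))/ε, and the helper names `laplace_threshold` and `laplace_keep_probability` are hypothetical, chosen only for this illustration.

```python
import math


def laplace_threshold(epsilon: float, delta: float) -> float:
    """Threshold T at which a partition with exactly one user is kept with probability delta.

    Assumption: every user contributes to one partition only, so the count has sensitivity 1.
    """
    return 1.0 + math.log(1.0 / (2.0 * delta)) / epsilon


def laplace_keep_probability(num_users: int, epsilon: float, delta: float) -> float:
    """Closed form of P[num_users + Laplace(scale=1/epsilon) >= T]."""
    distance = num_users - laplace_threshold(epsilon, delta)
    # Mass of one Laplace tail beyond |distance|.
    tail = 0.5 * math.exp(-abs(distance) * epsilon)
    return 1.0 - tail if distance >= 0 else tail


epsilon, delta = 0.1, 1e-10
print("threshold:", laplace_threshold(epsilon, delta))
for n in (1, 150, 225, 300):
    print(n, laplace_keep_probability(n, epsilon, delta))
```

With ε = 0.1 and δ = 1e-10 the threshold falls between 224 and 225 users, which is why a user range of roughly 150 to 300 covers the interesting part of the curve.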

Affected Dependencies

No dependencies are affected. Some Bazel BUILD files were changed to include @google_dp//...:partition-selection.

How has this been tested?

The following Python script reproduces Figure 1 from the paper:

import numpy as np
import matplotlib.pyplot as plt

import pydp as dp

epsilon = 0.1
delta = 1e-10
num_partitions = 1
amount_of_users = np.arange(150, 301)
# amount_of_users = np.arange(5, 21)
truncated_geometric = dp.partition_selection.create_partition_strategy(
    "truncated_geometric", epsilon, delta, num_partitions)
laplace = dp.partition_selection.create_laplace_partition_strategy(epsilon, delta, num_partitions)

probs_truncated_geometric = []
probs_laplace = []
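# Estimate the keep probability for each user count by Monte Carlo simulation.
num_simulations = 10000
for num_users in amount_of_users:
    sims_truncated_geometric = []
    sims_laplace = []
    for _ in range(num_simulations):
        sims_truncated_geometric.append(truncated_geometric.should_keep(num_users))
        sims_laplace.append(laplace.should_keep(num_users))
    # The mean of the boolean outcomes is the empirical keep probability.
    probs_truncated_geometric.append(np.mean(sims_truncated_geometric))
    probs_laplace.append(np.mean(sims_laplace))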

plt.plot(amount_of_users, probs_truncated_geometric, color='red', label="Truncated geometric")
plt.plot(amount_of_users, probs_laplace, color='blue', linestyle='dashed', label="Laplace")
plt.legend()
plt.xlabel("$n$")
plt.ylabel(r"$\mathbb{P}[\rho(n) = keep]$")
plt.xlim(amount_of_users.min(), amount_of_users.max())
plt.ylim(0, 1)
title = "Release Probability depending on \nthe number of unique users. $\\varepsilon=%.1e, \\delta=%.1e$" % (
    epsilon, delta)
plt.title(title)
plt.show()
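
The simulated truncated geometric curve can be cross-checked against an analytical one. The sketch below assumes the recurrence for the optimal keep probability from the paper, π(0) = 0 and π(n) = min(e^ε·π(n−1) + δ, 1 − e^(−ε)·(1 − π(n−1) − δ), 1), for the case where each user contributes to one partition. The function name `optimal_keep_probabilities` is hypothetical and not part of PyDP.

```python
import math


def optimal_keep_probabilities(max_users: int, epsilon: float, delta: float) -> list:
    """Return [pi(0), ..., pi(max_users)] using the assumed recurrence.

    Assumption: each user contributes to a single partition.
    """
    probs = [0.0]  # A partition with no users is never released.
    for _ in range(max_users):
        prev = probs[-1]
        probs.append(min(
            math.exp(epsilon) * prev + delta,                  # growth phase
            1.0 - math.exp(-epsilon) * (1.0 - prev - delta),   # saturation phase
            1.0,
        ))
    return probs


epsilon, delta = 0.1, 1e-10
probs = optimal_keep_probabilities(600, epsilon, delta)
for n in (1, 150, 200, 225, 300):
    print(n, probs[n])
```

Plotting `probs[150:301]` on the same axes as the simulated red curve gives a quick visual check that the wrapper behaves as expected.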


chinmayshah99 commented 3 years ago

Also, the style tests are failing. Please look into that. You can run `make` to format both the Python and C++ code.

levzlotnik commented 3 years ago

@dvadym @chinmayshah99 I added tests for Truncated Geometric partition selection. For the Python tests of Laplace/Gaussian partition selection, I would need the Laplace/Gaussian mechanisms from #372.

levzlotnik commented 3 years ago

@chinmayshah99 @dvadym thank you for approving the PR! 😄