pytorch / opacus

Training PyTorch models with differential privacy
https://opacus.ai
Apache License 2.0

fix: make prv accountant robust to larger epsilons #606

Closed Solosneros closed 7 months ago

Solosneros commented 9 months ago

Motivation and Context / Related issue

Hi,

this PR fixes https://github.com/pytorch/opacus/issues/601 and https://github.com/pytorch/opacus/issues/604.

It introduces the same fix as https://github.com/microsoft/prv_accountant/pull/38. Lukas (@wulu473, the author of the prv accountant) said that, in general, adding additional points is safe and won't affect the robustness negatively.

The cause of these errors seems to be the grid used to compute the mean() of the PrivacyRandomVariableTruncated class. That grid (the points variable) is constant apart from its lowest point (self.t_min) and its highest point (self.t_max).

This PR instead derives the grid (the points variable) from the lowest and highest points. More details below.

Best

Observation

I debugged the code and eventually arrived at the mean() function of the PrivacyRandomVariableTruncated class. The grid (the points variable) used to compute the mean is constant apart from the lowest point (self.t_min) and the highest point (self.t_max); see the permalink under "Before" below. It looks like [self.t_min, -0.1, -0.01, -0.001, -0.0001, -1e-05, 1e-05, 0.0001, 0.001, 0.01, 0.1, self.t_max].

It seems that t_min and t_max are on the order of [-12, 12] for the examples I posted above, and even up to [-48, 48] for the example @jeandut posted in https://github.com/pytorch/opacus/issues/604, whereas they are more like [-7, 7] for the README DP-SGD example.

We suspect that the integration breaks down when the grid spacing near t_min and t_max gets too large; the short sketch below illustrates how quickly the first interval grows.
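
To get a feeling for the scale, here is a tiny standalone sketch (round t_min values taken from the magnitudes reported above) of how wide the fixed grid's first interval, [t_min, -0.1], becomes:

# width of the fixed grid's first interval, [t_min, -0.1], for the
# magnitudes reported above (round values for illustration)
for t_min in (-7.0, -12.0, -48.0):
    print(f"t_min = {t_min:6.1f}: first interval spans {-0.1 - t_min:.1f}")

# t_min =   -7.0: first interval spans 6.9
# t_min =  -12.0: first interval spans 11.9
# t_min =  -48.0: first interval spans 47.9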

Proposed solution

Determine the points grid based on t_min and t_max: the start and end exponents of the logspace are derived from t_min and t_max instead of being fixed.

Before: (https://github.com/pytorch/opacus/blob/95df0904ae5d2b3aaa26b708e5067e9271624036/opacus/accountants/analysis/prv/prvs.py#L99-L106)
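
For reference, here is a standalone reconstruction of that grid, pieced together from the point list in the Observation above (the permalink is authoritative; t_min/t_max stand in for self.t_min/self.t_max):

import numpy as np

# fixed interior points; only the endpoints depend on the truncation bounds
t_min, t_max = -48.0, 48.0  # magnitudes reported in issue #604

points = np.concatenate(
    [
        [t_min],
        -np.logspace(start=-1, stop=-5, num=5),  # [-0.1, ..., -1e-05]
        np.logspace(start=-5, stop=-1, num=5),   # [1e-05, ..., 0.1]
        [t_max],
    ]
)
# every interior point lies inside [-0.1, 0.1], so the grid jumps
# straight from t_min to -0.1 no matter how large |t_min| is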

After:

# determine points based on t_min and t_max: the logspace now starts at
# the exponent of the truncation bound instead of a fixed 10**-1
lower_exponent = int(np.log10(np.abs(self.t_min)))
upper_exponent = int(np.log10(self.t_max))
points = np.concatenate(
    [
        [self.t_min],
        -np.logspace(start=lower_exponent, stop=-5, num=10),  # negative half
        [0],
        np.logspace(start=-5, stop=upper_exponent, num=10),  # positive half
        [self.t_max],
    ]
)
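
For comparison with the reconstruction above, the same standalone sketch with the new grid (again t_min/t_max stand in for self.t_min/self.t_max):

import numpy as np

t_min, t_max = -48.0, 48.0  # magnitudes reported in issue #604

lower_exponent = int(np.log10(np.abs(t_min)))  # int(log10(48)) == 1
upper_exponent = int(np.log10(t_max))          # 1
points = np.concatenate(
    [
        [t_min],
        -np.logspace(start=lower_exponent, stop=-5, num=10),
        [0],
        np.logspace(start=-5, stop=upper_exponent, num=10),
        [t_max],
    ]
)
# the negative half now steps down log-uniformly from -10 to -1e-05
# instead of jumping straight from -48 to -0.1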

How Has This Been Tested (if it applies)

I ran the examples from https://github.com/pytorch/opacus/issues/601 and https://github.com/pytorch/opacus/issues/604, and they no longer fail.

import opacus

target_delta = 0.001
steps = 5000
sample_rate = 0.19120458891013384

for target_epsilon in [20, 50]:
    noise_multiplier = opacus.privacy_engine.get_noise_multiplier(
        target_delta=target_delta,
        target_epsilon=target_epsilon,
        steps=steps,
        sample_rate=sample_rate,
        accountant="prv",
    )
    prv_accountant = opacus.accountants.utils.create_accountant("prv")
    prv_accountant.history = [(noise_multiplier, sample_rate, steps)]
    obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
    print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 20, obtained epsilon 19.999332284974717
target epsilon 50, obtained epsilon 49.99460075990896

target_epsilon = 4
batch_size = 50
epochs = 5
target_delta = 1e-05
expected_len_dataloader = 500 // batch_size
sample_rate = 1 / expected_len_dataloader

noise_multiplier = opacus.privacy_engine.get_noise_multiplier(
    target_delta=target_delta,
    target_epsilon=target_epsilon,
    epochs=epochs,
    sample_rate=sample_rate,
    accountant="prv",
)
prv_accountant = opacus.accountants.utils.create_accountant("prv")
prv_accountant.history = [(noise_multiplier, sample_rate, int(epochs / sample_rate))]
obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 4, obtained epsilon 3.9968389923130356

Checklist

I was not able to run all tests locally, and I am unsure whether new tests should be added.

facebook-github-bot commented 9 months ago

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 7 months ago

@Solosneros has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 7 months ago

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 7 months ago

This pull request has been merged in pytorch/opacus@ad084da9e46b22d6bc341958855a04c00ffb9b1f.