tensorflow / privacy

Library for training machine learning models with privacy for training data
Apache License 2.0

Following the original DPSGD algorithm (Abadi et al, 2016) #135

Open mhdiamian opened 4 years ago

mhdiamian commented 4 years ago

Hi. First off, I presume that the code, particularly "dp_optimizer.py", implements the original algorithm proposed by Abadi et al., 2016 (https://arxiv.org/abs/1607.00133). If this is not the case, please correct me. If it is, the algorithm is meant to clip the gradient corresponding to each individual input sample. However, in the code, "process_microbatch" starts by taking the mean over the individual gradients in the microbatch, each of which corresponds to one input sample, so the impact of individual input samples is averaged away before clipping. It therefore seems that the Abadi algorithm is not followed. Is this true?
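To make my reading concrete, here is a rough sketch of what I understand the microbatch logic to be doing. This is not the library's actual code; the function name and details are mine, and the hyperparameter values are illustrative.

```python
import tensorflow as tf

def noisy_clipped_gradient(per_example_grads, num_microbatches,
                           l2_norm_clip, noise_multiplier):
    # Hypothetical re-implementation for illustration only, not the library's code.
    dim = per_example_grads.shape[-1]
    # Group the per-example gradients into num_microbatches microbatches.
    grouped = tf.reshape(per_example_grads, [num_microbatches, -1, dim])
    # The step in question: mean over the examples inside each microbatch.
    microbatch_grads = tf.reduce_mean(grouped, axis=1)
    # Clip each microbatch-averaged gradient to L2 norm l2_norm_clip.
    norms = tf.norm(microbatch_grads, axis=1, keepdims=True)
    clipped = microbatch_grads * tf.minimum(1.0, l2_norm_clip / (norms + 1e-12))
    # Sum the clipped gradients, add Gaussian noise calibrated to the clip norm,
    # and normalize by the number of microbatches.
    noise = tf.random.normal([dim], stddev=noise_multiplier * l2_norm_clip)
    return (tf.reduce_sum(clipped, axis=0) + noise) / num_microbatches

# With num_microbatches equal to the batch size, each microbatch holds a single
# example, so the mean is a no-op and clipping is applied per example.
grads = tf.random.normal([8, 4])  # 8 examples, 4 parameters
print(noisy_clipped_gradient(grads, num_microbatches=8,
                             l2_norm_clip=1.0, noise_multiplier=1.1))
```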

galenmandrew commented 4 years ago

Hello. Yes, the basic algorithm of Abadi et al. is supported. You can get it by leaving num_microbatches unspecified, so it defaults to the size of the minibatch. However, this is very slow because it means every example must be processed independently. Microbatches (introduced here and discussed here) let you get back some of the efficiency of processing examples in minibatches.
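For concreteness, here is a minimal sketch roughly following the library's MNIST tutorial; the hyperparameter values are illustrative only.

```python
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers import dp_optimizer

# Leaving num_microbatches=None makes it default to the batch size, i.e. one
# example per microbatch, which recovers the per-example clipping of Abadi et al.
optimizer = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=1.0,        # illustrative value
    noise_multiplier=1.1,    # illustrative value
    num_microbatches=None,   # None -> one microbatch per example (slow but exact)
    learning_rate=0.15)

# The loss passed to the optimizer must be a vector of per-example losses
# (no reduction), so gradients can be computed and clipped per microbatch.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE)
```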

mhdiamian commented 4 years ago

Thank you for the reply, I really appreciate it. I get your point. My remaining reservation is that it seems to me the idea of clipping the gradient per individual input sample is not maintained in the code. Is that the case?

galenmandrew commented 4 years ago

Please see the comment here. Per-example gradient clipping is supported by leaving num_microbatches=None.

mhdiamian commented 4 years ago

You are absolutely right. In fact, when the number of microbatches differs from the minibatch size, the algorithm is not supposed to look at individual gradients within a microbatch (that's why it takes the mean). I thought otherwise. Now it's clear, thanks to you. Thank you so much, I really appreciate it. It helped.