Open · mhdiamian opened this issue 4 years ago
Hello. Yes, the basic algorithm of Abadi et al. is supported. You can get it by leaving num_microbatches unspecified, so it defaults to the size of the minibatch. However, this is very slow because it means every example must be processed independently. Microbatches (introduced here and discussed here) let you get back some of the efficiency of processing examples in minibatches.
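For concreteness, here is a minimal sketch of constructing the optimizer both ways, modeled on the tensorflow_privacy tutorials; the hyperparameter values are placeholders and the import path may differ between library versions:

```python
# Minimal sketch based on the tensorflow_privacy tutorials; hyperparameter
# values are placeholders and import paths may differ between versions.
from tensorflow_privacy.privacy.optimizers import dp_optimizer

# Per-example clipping as in Abadi et al.: leave num_microbatches=None so it
# defaults to the minibatch size (slow, every example handled independently).
per_example_opt = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=None,
    learning_rate=0.15)

# Faster variant: 32 microbatches; the mean gradient of each microbatch
# (not of each example) is what gets clipped.
microbatched_opt = dp_optimizer.DPGradientDescentGaussianOptimizer(
    l2_norm_clip=1.0,
    noise_multiplier=1.1,
    num_microbatches=32,
    learning_rate=0.15)
```

Note that, as in the tutorials, these optimizers are given a vector of per-example losses rather than a scalar mean loss, so the optimizer itself controls how examples are grouped and reduced.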
Thank you for the reply, I really appreciate it. I get your point. My reservation is that, as far as I can tell, the idea of clipping the gradient per individual input sample is not maintained in the code. Is this the case?
Please see the comment here. Clipping the gradient per example is supported by leaving num_microbatches=None.
You are absolutely right. In fact, when the number of microbatches differs from the minibatch size, the algorithm is not supposed to look at individual gradients within a microbatch (that's why it takes the mean). I thought otherwise. Now it's clear, thanks to you. Thank you so much, I really appreciate it. It helped.
Hi. First off, I presume that the code, particularly "dp_optimizer.py", implements the original algorithm proposed by Abadi et al., 2016 (https://arxiv.org/abs/1607.00133). If this is not the case, please correct me. If so, it is meant to clip the gradient corresponding to each individual input sample. However, in the code, "def process_microbatch" takes, right at the very beginning, the mean over the individual gradients in the microbatch, each of which corresponds to one input sample, so that the impact of individual input samples is averaged out before clipping. So it seems the Abadi algorithm is not followed. Is this true?
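For readers who land on this thread later, here is an illustrative sketch of the per-microbatch processing being discussed. This is not the actual dp_optimizer.py code; loss_fn, params, and the data tensors are hypothetical placeholders, and it assumes TF 2.x eager execution:

```python
# Illustrative sketch of DP-SGD microbatch processing (not the actual
# dp_optimizer.py code). `loss_fn`, `params`, and the data tensors are
# hypothetical placeholders.
import tensorflow as tf

def dp_gradients(loss_fn, params, x_batch, y_batch,
                 num_microbatches, l2_norm_clip, noise_multiplier):
    # Split the minibatch into microbatches. With num_microbatches equal to
    # the minibatch size, each microbatch holds a single example and the
    # clipping below is exactly per-example clipping (Abadi et al.).
    xs = tf.split(x_batch, num_microbatches)
    ys = tf.split(y_batch, num_microbatches)

    clipped_sum = [tf.zeros_like(p) for p in params]
    for x_mb, y_mb in zip(xs, ys):
        with tf.GradientTape() as tape:
            # Mean loss over the microbatch: this is the averaging step the
            # question refers to. Examples inside one microbatch are merged
            # before any clipping happens.
            loss = tf.reduce_mean(loss_fn(x_mb, y_mb))
        grads = tape.gradient(loss, params)
        # Clip the microbatch gradient to l2_norm_clip and accumulate.
        clipped, _ = tf.clip_by_global_norm(grads, l2_norm_clip)
        clipped_sum = [s + g for s, g in zip(clipped_sum, clipped)]

    # Add Gaussian noise calibrated to the clipping norm, then average over
    # microbatches to get the final noisy gradient.
    noisy = [
        (s + tf.random.normal(tf.shape(s),
                              stddev=noise_multiplier * l2_norm_clip))
        / num_microbatches
        for s in clipped_sum
    ]
    return noisy
```

With one example per microbatch this reduces to the clip-then-noise procedure of Abadi et al.; with larger microbatches the clipping bound applies to the microbatch's mean gradient, which is the behavior discussed above.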