How is the normalization of the error term performed?

spott commented 6 years ago

https://github.com/facebookresearch/odin/blob/64e97962ccaed1fe979f43a089c0feb4d8b002fd/code/calData.py#L68-L70

Is this the average value of all pixels for each channel for all images? Or something else?

nicolasj92 commented 5 years ago

I would also like to know this. Why do you add a fraction of the normalized gradient instead of taking its sign, as stated in the paper?

jwarley commented 5 years ago

@spott It's dividing by the standard deviation of the pixel values for each channel, since that's how the images are preprocessed (see line 36 of cal.py). I'm not really sure why they normalize the gradient, though. Shouldn't that be equivalent to just picking a different value of epsilon (one per channel, since the standard deviations differ)? Does the normalized epsilon have some nicer physical interpretation?
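One way to check the "different epsilon" intuition is to follow a single step through the preprocessing arithmetic. The sketch below is not taken from the repository: the mean and std constants are assumed CIFAR-style values, and the helper names are hypothetical. It takes a step of eps * sign / std in normalized space, maps the result back to pixel space, and records how far each pixel moved.

```python
# Assumed preprocessing constants (CIFAR-style, scaled to [0, 1]); illustrative only.
MEAN = [125.3 / 255.0, 123.0 / 255.0, 113.9 / 255.0]
STD = [63.0 / 255.0, 62.1 / 255.0, 66.7 / 255.0]


def normalize(pixels):
    """Map per-channel pixel values in [0, 1] to normalized space."""
    return [[(p - m) / s for p in ch] for ch, m, s in zip(pixels, MEAN, STD)]


def denormalize(normed):
    """Inverse of normalize: back to raw pixel values."""
    return [[v * s + m for v in ch] for ch, m, s in zip(normed, MEAN, STD)]


def step_normalized(normed, grad_sign, eps):
    """Subtract eps * sign / std from the normalized input, channel by channel."""
    return [
        [v - eps * g / s for v, g in zip(ch, gs)]
        for ch, gs, s in zip(normed, grad_sign, STD)
    ]


pixels = [[0.2, 0.8], [0.5, 0.1], [0.9, 0.4]]        # 3 channels, 2 pixels each
grad_sign = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]   # made-up gradient signs
eps = 0.0014                                         # made-up noise magnitude

stepped = denormalize(step_normalized(normalize(pixels), grad_sign, eps))

# Shift of each pixel measured in raw [0, 1] units.
pixel_shift = [
    [q - p for p, q in zip(ch_p, ch_q)] for ch_p, ch_q in zip(pixels, stepped)
]
print(pixel_shift)
```

Under these assumptions every pixel moves by eps in magnitude in unnormalized units, whatever the channel, because the division by std cancels the multiplication by std when mapping back. That may be the interpretation the question is after, though whether it was the authors' motivation is not confirmed in this thread.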

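@nicolasj92 If you're asking about the line `tempInputs = torch.add(inputs.data, -noiseMagnitude1, gradient)`, it subtracts epsilon times the (normalized) sign of the gradient, as advertised in the paper. In that PyTorch version the three-argument form computes input + value * other, so the second argument is a scalar multiplier rather than something that gets added. I was fooled by the counterintuitive function signature as well (see https://pytorch.org/docs/0.3.1/torch.html?highlight=torch%20add#torch.add).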
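To make the argument order concrete, here is a plain-Python stand-in for the old three-argument call. `legacy_add` is a hypothetical helper written for this comment, not a PyTorch function, and the numbers are made up; it only mirrors the documented behaviour, input + value * other.

```python
def legacy_add(inp, value, other):
    """Stand-in for the PyTorch 0.3.1 call torch.add(input, value, other).

    Computes input + value * other element-wise. In current PyTorch the same
    operation is written torch.add(input, other, alpha=value).
    """
    return [x + value * o for x, o in zip(inp, other)]


inputs = [0.50, -0.25, 1.00]       # made-up input values
gradient = [1.0, -1.0, 1.0]        # made-up gradient signs
noise_magnitude = 0.1              # plays the role of epsilon

# Passing -noise_magnitude as the scalar subtracts epsilon * gradient.
temp_inputs = legacy_add(inputs, -noise_magnitude, gradient)
print(temp_inputs)  # approximately [0.4, -0.15, 0.9]
```

Read this way, the line in question is a step against the gradient sign with step size epsilon, not an addition of three tensors.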

Would be interested to hear the authors' comments on why the normalization step is preferred.

monney commented 5 years ago

I’m also wondering about this. I think it’s so that you can always search over the same range of epsilons, since the gradient is normalized. But I don’t fully understand why they normalize it in that fashion, since that std comes from a different distribution. Later in the code, the Gaussian noise input undergoes the same transformations as the original image set, but that won’t give it a standard normal distribution, so I’m not sure why those values are still used.
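A quick numerical check of the last point, as a rough sketch rather than a reproduction of the repository code. It assumes the noise is a standard normal shifted by 0.5 and clipped to [0, 1], and that one channel is then normalized with CIFAR-style constants; both the sampling recipe and the constants are assumptions made for illustration.

```python
import random
import statistics

# Assumed constants for a single channel (CIFAR-style, scaled to [0, 1]).
CHANNEL_MEAN = 125.3 / 255.0
CHANNEL_STD = 63.0 / 255.0

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed noise recipe: N(0, 1) shifted by 0.5, then clipped to [0, 1].
noise = [min(1.0, max(0.0, rng.gauss(0.0, 1.0) + 0.5)) for _ in range(100_000)]

# Apply the same per-channel normalization the images would receive.
transformed = [(v - CHANNEL_MEAN) / CHANNEL_STD for v in noise]

print("mean:", statistics.mean(transformed))
print("std: ", statistics.pstdev(transformed))
```

With these assumptions the transformed noise ends up with a standard deviation noticeably larger than 1, because the clipped noise is far more spread out than the image pixels the constants were computed from. So the dataset statistics do not standardize this input.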
