Closed zplizzi closed 5 years ago
BTW, I think this should be torch.randn, not torch.rand.
This line is only there to undo the discretization of the image (that's also why we use torch.rand, not torch.randn). As it is very small (< the distance between two color values), I don't think it makes much of a difference, but I agree that you could interpret it as some sort of instance noise.
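For concreteness, here is a sketch of what that dequantization step looks like (the tensor shapes and scaling here are illustrative, not the repo's exact code): 8-bit images scaled to [-1, 1] have 256 levels, so adjacent color values are 2/256 = 1/128 apart, and uniform noise on [0, 1/128) fills in the gaps left by discretization.

```python
import torch

# Fake 8-bit image batch (values 0..255); shapes are illustrative.
x = torch.randint(0, 256, (4, 3, 8, 8)).float()

# Scale to [-1, 1]: adjacent color values are now 2/256 = 1/128 apart.
x = x / 127.5 - 1.0

# Dequantization noise: uniform on [0, 1/128), i.e. torch.rand (uniform),
# not torch.randn (Gaussian) -- it should never exceed one quantization step.
x = x + torch.rand_like(x) / 128.0
```

Using uniform rather than Gaussian noise is what makes this an "undo discretization" step: each quantized value is smeared back over exactly the interval it was rounded from.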
From a theory point of view (eq. 6 and eq. 8 in the paper), a parameter gamma for your R1 regularizer roughly corresponds to instance noise with standard deviation sqrt(2 * gamma), i.e. gamma = 10 corresponds to instance noise with standard deviation sqrt(20) ≈ 4.5, which is much bigger than 1/128 (and not really feasible for images - that's why we use regularization instead).
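The arithmetic behind that comparison is worth spelling out, since it shows just how far apart the two noise scales are:

```python
import math

gamma = 10.0                       # R1 regularizer weight from the discussion
sigma = math.sqrt(2 * gamma)       # equivalent instance-noise std: sqrt(20) ~ 4.47
deq = 1.0 / 128.0                  # dequantization noise scale: ~ 0.0078

# The instance noise implied by gamma=10 is roughly 570x larger than the
# dequantization noise, so the latter cannot act as a regularizer of
# comparable strength.
ratio = sigma / deq
```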
Hi @LMescheder, thank you for your clarification!
Makes complete sense, thanks!
I noticed in the code for inputs.py that there appears to be instance noise applied universally to all training examples (along with other data augmentation). Was this, and the other data augmentation, used for all the experimental results in the paper? I don't remember it being mentioned. I was surprised to see this, as instance noise was one of the regularization approaches you were comparing the R1 approaches to - I didn't realize both were being used simultaneously. Or is the scale of the noise injected here (uniform on [0, 1/128)) much lower than required for use as a regularizer?