Normalization of input

LeoniekevandenBulk commented 5 years ago

I am trying to use the pretrained models in PyTorch and am therefore writing a replacement for the VideoInput layer, but I am not sure about the exact normalization that was applied to the input. Is it scaled to between -1 and 1, or between 0 and 1, at the end? It would be very helpful if this could be elaborated on.

Thanks!

keunhong commented 5 years ago

I've been digging through the code and can't seem to find a clear answer for this. How are the inputs normalized? What mean/std are used? Are they scaled between -1 and 1?

Edit: judging from the code, it seems that the undocumented VideoInput op (https://github.com/pytorch/pytorch/blob/8e9692df2787b64f879e83db617745b810bd7ef2/caffe2/video/video_input_op.h) by default subtracts 128 from each channel but doesn't divide by anything. This would put the inputs between -128 and +127. Is this correct?
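
To make the range claim concrete, here is a minimal NumPy sketch of the "subtract 128, don't divide" reading. The function name `subtract_mean_only` is hypothetical and is not taken from the VideoInput op; it only illustrates the arithmetic on 8-bit pixel values.

```python
import numpy as np

def subtract_mean_only(frames_uint8, mean=128.0):
    # Hypothetical helper: cast to float first so the subtraction
    # does not wrap around in unsigned 8-bit arithmetic.
    return frames_uint8.astype(np.float32) - mean

# The extreme and middle values an 8-bit pixel can take.
pixels = np.array([0, 128, 255], dtype=np.uint8)
shifted = subtract_mean_only(pixels)
print(shifted.min(), shifted.max())
```

With no division, the smallest and largest possible pixel values map to -128 and +127 respectively.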

dutran commented 5 years ago

We normalize the input to zero mean and unit variance. The mean/std values are here: https://github.com/pytorch/pytorch/blob/master/caffe2/video/video_input_op.h#L374-L378
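
For anyone porting this to PyTorch, a rough sketch of per-channel mean/std normalization might look like the following. This is an illustration under assumptions, not the VideoInput implementation: `normalize_clip`, `PLACEHOLDER_MEAN` and `PLACEHOLDER_STD` are hypothetical names, the numeric values are placeholders that must be replaced with the ones at the linked lines, and the RGB channel order and the (C, T, H, W) output layout are assumptions that should be checked against the model being used.

```python
import numpy as np

def normalize_clip(clip_uint8, mean_rgb, std_rgb):
    """Standardize a (T, H, W, C) uint8 clip per channel.

    Returns a float32 array laid out as (C, T, H, W), which is an
    assumed layout for 3D-conv models, not something the thread states.
    """
    clip = clip_uint8.astype(np.float32)
    mean = np.asarray(mean_rgb, dtype=np.float32)
    std = np.asarray(std_rgb, dtype=np.float32)
    # mean and std have shape (C,), so they broadcast over the last axis.
    clip = (clip - mean) / std
    return np.transpose(clip, (3, 0, 1, 2))

# PLACEHOLDER values only: substitute the real per-channel statistics
# from the linked video_input_op.h lines.
PLACEHOLDER_MEAN = (128.0, 128.0, 128.0)
PLACEHOLDER_STD = (64.0, 64.0, 64.0)

rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(8, 112, 112, 3), dtype=np.uint8)
out = normalize_clip(clip, PLACEHOLDER_MEAN, PLACEHOLDER_STD)
print(out.shape, out.dtype)
```

Note that standardizing this way does not clamp values to [-1, 1] or [0, 1]; the output range depends on the mean and std that are used. The result can be wrapped with `torch.from_numpy(out)` and given a batch dimension before being fed to a model.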

facebookresearch / VMZ

Normalization of input #60