Closed yxqlwl closed 7 years ago
The first dimension of input is the image number; it's just image1 or image2, so it makes sense to add the same mean to both. There are actually 3 different means and stds, one for each channel (red, green and blue).
The inputs dimensions are organized like this:
ImageNb x ChannelNb x width x height
This snippet would do the same job and may be simpler to understand:
for i=1,loadSize[1] do -- channels
  if mean then inputs[{{},{i}}]:add(-mean[i]) end -- select all pixels of channel i, regardless of whether they come from image 1 or image 2, and shift them
  if std then inputs[{{},{i}}]:div(std[i]) end -- same selection, divided by the channel's std
end
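For illustration only (the project itself is Torch/Lua), here is a small NumPy sketch of the same per-channel normalization on a hypothetical ImageNb x ChannelNb x width x height batch, with invented sizes; it shows that both images receive the same per-channel shift and scale:

```python
import numpy as np

# Hypothetical batch: 2 images, 3 channels, 4x4 pixels (ImageNb x ChannelNb x width x height)
inputs = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)

# One mean and one std per channel, computed over all pixels of both images
mean = inputs.mean(axis=(0, 2, 3))
std = inputs.std(axis=(0, 2, 3))

for i in range(3):  # channels
    inputs[:, i] -= mean[i]  # same shift for image 1 and image 2
    inputs[:, i] /= std[i]   # same scale for image 1 and image 2

# Each channel now has zero mean and unit std across both images
print(np.allclose(inputs.mean(axis=(0, 2, 3)), 0))  # True
print(np.allclose(inputs.std(axis=(0, 2, 3)), 1))   # True
```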
Thanks for your advice. And for the FlowNetS model, the normalization step is only performed on the input images and not in the following layers (unlike the FlowNetSBN model), right?
That's right. It's not exactly equivalent, because we compute mean and std values for the inputs once and for all before training, whereas BN layers explicitly compute the batch mean and std and remove them.
So you would not get the same results if a BN layer were added directly at the input of the network instead of just normalizing it beforehand. I believe it would cause some overfitting problems, as you can see in the graph I provided for FlowNetSBN accuracy in the README.
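To see why the two are not equivalent, here is a NumPy sketch (illustrative only, with made-up toy data): fixed normalization uses statistics computed once over the whole dataset, while a BN-style layer recomputes statistics from each batch, so the outputs differ whenever the batch statistics deviate from the dataset statistics:

```python
import numpy as np

rng = np.random.default_rng(0)
dataset = rng.normal(5.0, 2.0, size=(1000, 3, 8, 8))  # toy dataset

# Fixed normalization: mean/std computed once, before training
d_mean = dataset.mean(axis=(0, 2, 3), keepdims=True)
d_std = dataset.std(axis=(0, 2, 3), keepdims=True)

batch = dataset[:4]  # a small training batch
fixed = (batch - d_mean) / d_std

# BN-style: statistics recomputed from the batch itself
b_mean = batch.mean(axis=(0, 2, 3), keepdims=True)
b_std = batch.std(axis=(0, 2, 3), keepdims=True)
bn = (batch - b_mean) / b_std

# Differs because the batch's own stats are not the dataset's stats
print(np.allclose(fixed, bn))  # False
```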
nice. Thank you very much!
Sorry for troubling you. In the definition of the hook function in donkey.lua, there is a normalization step at the end:
That means input[{{1},{i},{},{}}] and input[{{2},{i},{},{}}] will both have the same -mean[i] added. I'm not sure I'm being clear, but shouldn't these two slices ( input[{{1},{i},{},{}}] and input[{{2},{i},{},{}}] ) perhaps be shifted by different mean values?
Thanks!