GT-Vision-Lab / VQA_LSTM_CNN

Train a deeper LSTM and normalized CNN Visual Question Answering model. This current code can get 58.16 on OpenEnded and 63.09 on Multiple-Choice on test-standard.

no clue #32

Open jijibn opened 7 years ago

jijibn commented 7 years ago

im=im*255;
im2=im:clone()
im2[{{3},{},{}}]=im[{{1},{},{}}]-123.68
im2[{{2},{},{}}]=im[{{2},{},{}}]-116.779
im2[{{1},{},{}}]=im[{{3},{},{}}]-103.939

Hello, could someone please explain this part of the code to me?

utkarshojha commented 6 years ago

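That is a standard image preprocessing step, usually applied before feeding images into most of the convolutional neural networks people use these days. The snippet multiplies the pixel values by 255 (presumably because the image was loaded in the range [0, 1]), subtracts a per-channel mean (123.68 for red, 116.779 for green, 103.939 for blue), and reverses the channel order from RGB to BGR, which is the input layout that Caffe-trained networks such as VGG expect. This may not make much intuitive sense right now, but such preprocessing helps the network train better.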
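
For anyone who finds the Torch indexing hard to read, here is a minimal sketch of the same three steps in plain Python. It is not code from this repository: the names `preprocess` and `MEAN_RGB` are made up for illustration, and it assumes the image arrives as three channel planes in R, G, B order with values in [0, 1].

```python
# Per-channel means in RGB order, copied from the snippet in the question.
MEAN_RGB = (123.68, 116.779, 103.939)


def preprocess(im):
    """Return mean-subtracted BGR planes for an RGB image.

    im: [R, G, B] channel planes, each a list of rows of floats in [0, 1]
        (an assumed layout, chosen to mirror the Torch tensor).
    """
    # Scale to 0..255 and subtract the mean of the matching channel.
    centered = [
        [[value * 255.0 - mean for value in row] for row in plane]
        for plane, mean in zip(im, MEAN_RGB)
    ]
    # Reverse the channel order: RGB -> BGR.
    return centered[::-1]


# A 1x1 pure-red image: R=1.0, G=0.0, B=0.0.
pixel = [[[1.0]], [[0.0]], [[0.0]]]
out = preprocess(pixel)
print(out)  # planes are now in B, G, R order
```

The red plane ends up last in the output, which matches `im2[{{3},{},{}}]=im[{{1},{},{}}]-123.68` in the original Lua code.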