In the readme file:
To train the i3d Non-local Networks with longer clips (32-frame input), we first need to obtain the model trained from "run_i3d_baseline_400k.sh" as a pre-trained model. Then we convert the Batch Normalization layers into Affine layers by running: python modify_caffe2_ftvideo.py xxxx
What is an affine layer? Is it a conv layer without batch normalization? Does a model with affine layers have more parameters than one with BN layers, judging by GPU memory usage? Is there some other operation in an affine layer?
I wonder that too. Is an affine layer an alternative to a batch normalization layer? If so, how does it work, and are there any research papers about it?
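For what it's worth, the conversion the README describes can be understood as folding frozen BN statistics into a per-channel scale and bias, since a BN layer running on fixed (running) statistics is exactly an affine transform. Here is a minimal numpy sketch of that equivalence (assuming the standard BN inference formula; this is not the repo's actual `modify_caffe2_ftvideo.py` logic, and the helper names are mine):

```python
import numpy as np

# At inference time, batch norm uses fixed running statistics, so per channel:
#   y = gamma * (x - mean) / sqrt(var + eps) + beta
# which collapses to y = scale * x + bias with:
#   scale = gamma / sqrt(var + eps)
#   bias  = beta - mean * scale
# The affine layer therefore stores only (scale, bias) per channel and never
# recomputes batch statistics, which is what you want when fine-tuning with
# tiny per-GPU batches (as with 32-frame clips).

def bn_inference(x, gamma, beta, mean, var, eps=1e-5):
    """Batch norm with frozen (running) statistics, applied per channel."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def fold_bn_to_affine(gamma, beta, mean, var, eps=1e-5):
    """Fold frozen-BN parameters into an equivalent (scale, bias) pair."""
    scale = gamma / np.sqrt(var + eps)
    bias = beta - mean * scale
    return scale, bias

rng = np.random.default_rng(0)
C = 4                                   # toy channel count
x = rng.standard_normal((2, C))         # toy activations, shape (N, C)
gamma, beta = rng.standard_normal(C), rng.standard_normal(C)
mean, var = rng.standard_normal(C), rng.random(C) + 0.1

scale, bias = fold_bn_to_affine(gamma, beta, mean, var)
# The folded affine transform matches frozen BN exactly.
assert np.allclose(bn_inference(x, gamma, beta, mean, var), scale * x + bias)
```

So an affine layer has strictly fewer parameters than BN (two per channel instead of four, and no running-statistics buffers); it is the same trick Detectron uses (its `AffineChannel` op) when fine-tuning with BN frozen.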