Open · Torment123 opened this issue 2 years ago
Hi,
I found that during training all of the batchnorm2d layers are frozen from the very beginning, meaning these parameters keep their random initialization without any learning. I'm a bit confused about why this step is needed. Thanks.
I believe batchnorm2d is initialized with mean 0 and variance 1 by default. We freeze the batchnorm2d parameters because empirically this worked well, although the difference was negligible.
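For reference, here is a minimal sketch of what freezing all batchnorm2d layers typically looks like in PyTorch; the ResNet-18 backbone and the `freeze_batchnorm2d` helper are illustrative, not this repo's actual code:

```python
import torch.nn as nn
import torchvision.models as models

# Illustrative backbone; the repo's actual model may differ.
model = models.resnet18()

def freeze_batchnorm2d(model: nn.Module) -> None:
    """Freeze every BatchNorm2d layer in the model."""
    for module in model.modules():
        if isinstance(module, nn.BatchNorm2d):
            # Keep the layer in eval mode so running_mean/running_var
            # (initialized to 0 and 1 by default) are never updated.
            module.eval()
            # Stop gradient updates to the affine parameters
            # (weight/gamma, initialized to 1; bias/beta, initialized to 0).
            for param in module.parameters():
                param.requires_grad = False

freeze_batchnorm2d(model)

# Caveat: model.train() flips BN layers back to training mode, so the
# freeze is usually re-applied after every call to model.train().
model.train()
freeze_batchnorm2d(model)
```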
Thanks for your fast response.