Open chrismessenger opened 3 years ago
The name given to the hidden layer batchnorm is not correct. It is being labelled with the same name as the one used for the convolutional layer batchnorm.
If this is a trainable parameter (which I'm unsure about) then this will be a bug.
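To illustrate the concern, here is a minimal, framework-agnostic sketch. It assumes parameters are looked up by name, which may or may not be how this code base behaves. The layer names, `get_variable`, and `find_duplicate_names` are hypothetical and not taken from the repository.

```python
from collections import Counter


def get_variable(store, name, initial):
    """Toy name-keyed variable store (an assumption, not a real library API).

    If the name already exists, the existing entry is returned, so two
    layers that ask for the same name end up sharing one parameter.
    """
    return store.setdefault(name, initial)


def find_duplicate_names(layer_names):
    """Return the names that appear more than once, sorted alphabetically."""
    counts = Counter(layer_names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical names: the hidden-layer batchnorm reuses the conv batchnorm name.
store = {}
conv_bn_scale = get_variable(store, "conv_batchnorm_0", [1.0])
hidden_bn_scale = get_variable(store, "conv_batchnorm_0", [1.0])

# Both layers now point at the same object, so training one changes the other.
shared = conv_bn_scale is hidden_bn_scale

names = ["conv_0", "conv_batchnorm_0", "hidden_0", "conv_batchnorm_0"]
duplicates = find_duplicate_names(names)
```

A check like `find_duplicate_names` could be run over the model's layer names to confirm whether the clash actually occurs. Whether the clash leads to shared parameters, an error, or an automatic rename depends on the framework in use.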