andreped / GradientAccumulator

:dart: Accumulated Gradients for TensorFlow 2
https://gradientaccumulator.readthedocs.io/
MIT License

Replacing AccumBatchNormalization not working as intended #108

Closed · andreped closed this 1 year ago

andreped commented 1 year ago

Describe the bug
If you attempt to replace all BatchNormalization layers with AccumBatchNormalization layers, the replacement itself works fine (mechanically).

However, if you then try to copy the weights from the old layer into the new one, the weight shapes do not match.
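A minimal sketch of the failing pattern (not the exact reproduction from the report): it swaps layers via `keras.models.clone_model` and assumes `AccumBatchNormalization` is importable from `gradient_accumulator.layers` with an `accum_steps` argument, as the package README shows.

```python
import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization
# Assumed import path; the layer ships with the gradient-accumulator package.
from gradient_accumulator.layers import AccumBatchNormalization

def swap_batch_norm(layer):
    # Replace every BatchNormalization with AccumBatchNormalization,
    # cloning all other layers unchanged.
    if isinstance(layer, BatchNormalization):
        return AccumBatchNormalization(accum_steps=4)
    return layer.__class__.from_config(layer.get_config())

model = tf.keras.applications.MobileNetV2(weights=None)
new_model = tf.keras.models.clone_model(model, clone_function=swap_batch_norm)

# This is where the reported bug surfaces: the accumulator layer carries
# extra internal variables (an assumption consistent with the shape
# mismatch described above), so set_weights raises a ValueError.
for old_layer, new_layer in zip(model.layers, new_model.layers):
    if isinstance(new_layer, AccumBatchNormalization):
        new_layer.set_weights(old_layer.get_weights())
```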

There are also no unit tests that actually verify that performance is unchanged after replacement. Such tests should be added, covering both inference and training; see the sketch below.
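One possible shape for an inference-equivalence test, as a minimal sketch only: it assumes freshly built layers of both kinds start from the same statistics (gamma=1, beta=0, moving_mean=0, moving_variance=1) and the same default epsilon as Keras' BatchNormalization. It is not the test that was ultimately added to the repo.

```python
import numpy as np
import tensorflow as tf
# Assumed import path for the accumulator layer.
from gradient_accumulator.layers import AccumBatchNormalization

def test_inference_matches_after_replacement():
    x = np.random.rand(8, 16).astype("float32")

    bn = tf.keras.layers.BatchNormalization()
    accum_bn = AccumBatchNormalization(accum_steps=1)

    # In inference mode both layers normalize with their (identical,
    # freshly initialized) moving statistics, so outputs should agree
    # up to numerical noise, assuming matching default epsilon.
    out_ref = bn(x, training=False)
    out_new = accum_bn(x, training=False)

    np.testing.assert_allclose(out_ref.numpy(), out_new.numpy(), atol=1e-6)
```

A training-side counterpart could run a few identical update steps through both layers and compare the resulting moving statistics, but that is left out here.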

Expected behavior
AccumBatchNormalization should work as a drop-in replacement for the existing BatchNormalization layer.

andreped commented 1 year ago

Managed to get a solution working locally just now. Will make a PR later.

andreped commented 1 year ago

Fixed in d3fce336879cedb52e4fefaf32850c0aec714aac