Closed by andafterall 2 years ago
Use ms.nn.Normalization instead of torch.nn.BatchNorm1d: in a parameter server architecture, we should use ms.nn.Normalization, which already handles the global exponentially weighted moving average.
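To illustrate why this matters, here is a minimal pure-Python sketch of the difference between per-worker and global running statistics. It does not show how ms.nn.Normalization is implemented; `GlobalEMA`, `local_emas`, and `global_ema` are hypothetical names made up for this example, and the round-robin push order is an assumption. The only part taken from a real library is the update rule, which is the one torch.nn.BatchNorm1d applies to `running_mean`.

```python
from dataclasses import dataclass


@dataclass
class GlobalEMA:
    """Hypothetical running mean; not part of ms.nn or torch.nn."""
    momentum: float = 0.1
    value: float = 0.0

    def push(self, batch_mean: float) -> float:
        # Same rule torch.nn.BatchNorm1d uses for running_mean:
        # new = (1 - momentum) * old + momentum * batch_mean
        self.value = (1.0 - self.momentum) * self.value + self.momentum * batch_mean
        return self.value


def local_emas(worker_batches, momentum=0.1):
    """Each worker tracks only its own shard, like one BatchNorm1d per worker."""
    results = []
    for batches in worker_batches:
        ema = GlobalEMA(momentum)
        for batch_mean in batches:
            ema.push(batch_mean)
        results.append(ema.value)
    return results


def global_ema(worker_batches, momentum=0.1):
    """All workers push batch means to one shared accumulator (assumed round-robin)."""
    ema = GlobalEMA(momentum)
    for step in zip(*worker_batches):
        for batch_mean in step:
            ema.push(batch_mean)
    return ema.value


if __name__ == "__main__":
    # Two workers whose data shards have very different batch means.
    shards = [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
    print("per-worker running means:", local_emas(shards))
    print("shared running mean:", global_ema(shards))
```

With local statistics, each worker ends up with a running mean that reflects only its own shard, so the workers disagree at inference time. A single shared accumulator gives one value that has seen every worker's batches, which is the behaviour the issue attributes to ms.nn.Normalization.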