SpinachR closed this issue 5 years ago
Since we use a batch size of 1, we turned off the batch-norm updates.
@wasidennis In case we would like to use batch normalization, how can we switch it back on?
@alphjheon In the model, you can comment out the lines that set the BN layers to "not require gradients", like the following:
for i in self.bn3.parameters(): i.requires_grad = False
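To make this concrete, here is a minimal sketch of toggling BN gradient updates across a whole model instead of commenting out individual lines. The helper name `set_bn_trainable` and the tiny model are illustrative, not part of the repo; note that with batch size 1 the running statistics are unreliable, so BN layers are often kept in eval mode even when their affine parameters are trained.

```python
import torch.nn as nn

def set_bn_trainable(model, trainable=True):
    """Flip requires_grad on the affine parameters of every BN layer."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            for p in m.parameters():
                p.requires_grad = trainable
            # Also control whether running mean/var are updated;
            # with batch size 1, keeping m.eval() is usually safer.
            m.train(trainable)

# Illustrative model, not the repo's ResNet-101
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
set_bn_trainable(model, False)  # freeze BN (the batch-size-1 case)
set_bn_trainable(model, True)   # re-enable BN updates
```

With a larger batch size you would call `set_bn_trainable(model, True)` once before training and let both the affine parameters and the running statistics update.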
@SpinachR My baseline ResNet-101 trained on GTA only is also around ~29%, while the paper reports ~35%. Can you share the steps to reproduce the source-only performance?
The reported source-only ResNet-101 performance is quite a bit higher than mine, which is only ~27%.
I have noticed that the parameters in the batch-norm layers are not trained. Should we update the batch statistics during training, or just use the statistics loaded from the pre-trained ResNet?
Could you provide more details based on resnet-101 model? @wasidennis