Open BarCodeReader opened 4 years ago
Also, I noticed that in your VGG code you set bias to False:
def _make_layers(self, output_channels, layer_num):
    layers = []
    while layer_num:
        layers.append(
            BasicConv(
                self.input_channels,
                output_channels,
                kernel_size=3,
                padding=1,
                bias=False
            )
        )
Then why does removing weight decay from the bias still improve accuracy by 4%?
Here "bias" includes gamma and beta in the BN layers as well as the bias terms in the fc and conv layers, though the conv bias is disabled here (bias=False).
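To make that concrete, here is a minimal sketch of how parameters could be split into a decayed and a non-decayed group. It is not the repo's actual code: the helper name split_decay_groups, the default weight_decay value, and the "ndim <= 1 means bias-like" rule are all assumptions for illustration. The stand-in parameters only exist so the sketch runs without PyTorch installed.

```python
from types import SimpleNamespace


def split_decay_groups(named_params, weight_decay=5e-4):
    """Split (name, param) pairs into decay / no-decay optimizer groups.

    Assumption (hypothetical rule, not taken from the repo): any parameter
    with ndim <= 1 (fc/conv bias, BN gamma and beta) or whose name ends in
    ".bias" is exempt from weight decay; everything else is decayed.
    """
    decay, no_decay = [], []
    for name, param in named_params:
        # Skip frozen parameters; objects without the attribute count as trainable.
        if not getattr(param, "requires_grad", True):
            continue
        if param.ndim <= 1 or name.endswith(".bias"):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


# Stand-in parameters (hypothetical names) mimicking a conv layer without bias,
# a BN layer, and an fc layer.
fake_params = [
    ("conv.weight", SimpleNamespace(ndim=4)),
    ("bn.weight", SimpleNamespace(ndim=1)),  # BN gamma
    ("bn.bias", SimpleNamespace(ndim=1)),    # BN beta
    ("fc.weight", SimpleNamespace(ndim=2)),
    ("fc.bias", SimpleNamespace(ndim=1)),
]
groups = split_decay_groups(fake_params)
print(len(groups[0]["params"]), len(groups[1]["params"]))
```

With a real model, the same list should be usable as optimizer parameter groups, e.g. torch.optim.SGD(split_decay_groups(model.named_parameters()), lr=0.1, momentum=0.9). Note that even with bias=False on every conv layer, the no-decay group still holds the BN gamma/beta and the fc bias, so the trick can still change the result.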
Hi,
Thanks for your implementation and repo.
I tested ResNet on CIFAR100, and it seems that label smoothing and no-bias-decay do not improve the result.
BTW, I disabled all image augmentation for both training and testing, used only normalization, and tested purely the tricks mentioned above.
If you have also tested on a ResNet, say ResNet56, and got different results, please let me know.
Thanks a lot.
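For reference, this is roughly what I mean by label smoothing: a minimal pure-Python sketch for a single sample, not the repo's implementation. The function name label_smoothing_ce and the default eps=0.1 are assumptions, and the smoothing mass eps is spread uniformly over all K classes (including the true one), which is one common convention.

```python
import math


def label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy of one sample against a label-smoothed target.

    Assumed convention: the target distribution puts (1 - eps) on the true
    class plus eps / K on every class, where K = len(logits).
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    k = len(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        weight = eps / k + ((1.0 - eps) if i == target else 0.0)
        loss -= weight * lp
    return loss


# With eps=0 this reduces to the ordinary cross-entropy.
plain = label_smoothing_ce([2.0, 0.5, -1.0], target=0, eps=0.0)
smoothed = label_smoothing_ce([2.0, 0.5, -1.0], target=0, eps=0.1)
print(plain, smoothed)
```

If I remember correctly, recent PyTorch versions expose the same idea through the label_smoothing argument of nn.CrossEntropyLoss, so a hand-written version like this is only needed on older versions.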