@zeno40 maybe - I haven't tried.
The reason I ask is because I trained a WRN-16-6 with 96 channels in the first convolution instead of 16, using the same training scheme (dropout of 0.3, L2 penalty of 0.0005, ReLU activation, and local mean/std normalization), and reached a test error of 4.21%. Clearly not state-of-the-art, but very close to some wider and deeper wide resnets.
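For reference, here is a rough PyTorch-style sketch of that kind of change (this is not the code from this repo; the class names, structure, and details are illustrative and only assume the standard WRN-16-k layout, with the first convolution's width exposed as a parameter):

```python
# Minimal sketch of a WRN-16-k where the output channels of the very first
# convolution are a free parameter (16 in the standard WRN, 96 in the
# experiment described above). PyTorch assumed; names are illustrative.
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride, dropout=0.3):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.drop = nn.Dropout(dropout)
        # 1x1 projection on the shortcut when the shape changes
        self.shortcut = None
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))
        skip = x if self.shortcut is None else self.shortcut(out)
        out = self.conv1(out)
        out = self.drop(F.relu(self.bn2(out)))
        out = self.conv2(out)
        return out + skip

class WRN16(nn.Module):
    def __init__(self, widen=6, first_conv_ch=16, num_classes=10, dropout=0.3):
        super().__init__()
        n = 2  # depth 16 -> (16 - 4) / 6 = 2 blocks per stage
        widths = [16 * widen, 32 * widen, 64 * widen]
        # The first convolution: 16 channels in the standard WRN,
        # 96 in the run reported in this thread.
        self.conv1 = nn.Conv2d(3, first_conv_ch, 3, padding=1, bias=False)
        layers, in_ch = [], first_conv_ch
        for i, w in enumerate(widths):
            for j in range(n):
                stride = 2 if (i > 0 and j == 0) else 1
                layers.append(BasicBlock(in_ch, w, stride, dropout))
                in_ch = w
        self.blocks = nn.Sequential(*layers)
        self.bn = nn.BatchNorm2d(in_ch)
        self.fc = nn.Linear(in_ch, num_classes)

    def forward(self, x):
        out = self.blocks(self.conv1(x))
        out = F.relu(self.bn(out))
        out = F.adaptive_avg_pool2d(out, 1).flatten(1)
        return self.fc(out)

# Standard stem vs. the widened variant described above:
baseline = WRN16(widen=6, first_conv_ch=16)
widened  = WRN16(widen=6, first_conv_ch=96)
```

The only difference between the two variants is `first_conv_ch`; everything after the first block sees the same widths either way.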
Hi,
Is it possible to get better accuracy in a wide resnet by using more output channels in the first convolution layer? For example 64 or 128, like the other convolutions get.
Thanks