google-research / mixmatch


The architecture for the CIFAR-100 dataset #12

Closed wang3702 closed 5 years ago

wang3702 commented 5 years ago

Could you please tell me the exact architecture for CIFAR-100? The paper states that the width of the Wide ResNet was increased. What is the increased width of the new architecture? For the 28*2 architecture, the number of filters grows from 16 to 32 to 64 to 128. What do you mean by "it has 135 filters per layer"? Does it mean every convolution layer has 135 filters? Thanks a lot!

david-berthelot commented 5 years ago

See how to reproduce the tables in the runs folder: https://github.com/google-research/mixmatch/blob/master/runs/table1.sh

Basically we pass the option --filters=135 (instead of 32); as you can see in libml/models.py, the layers grow proportionally to that number in the ResNet architecture (and in the other architectures too).
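For illustration, a minimal sketch of that proportional growth (the actual loop lives in libml/models.py; the `scales` value of 3 here is assumed, matching WRN-28's three resolution stages):

```python
# Widths per residual group scale with --filters (illustrative sketch).
filters = 135  # value passed via --filters (32 for WRN-28-2)
scales = 3     # assumed: three resolution stages in WRN-28
widths = [filters << scale for scale in range(scales)]
print(widths)  # [135, 270, 540]
```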

wang3702 commented 5 years ago

That is to say, it starts from 135, then 270, then 405 and 540 (for 4 blocks). If my understanding is correct, I will close the issue. Thanks!

wang3702 commented 5 years ago

Sorry for my incorrect calculation. The number of filters should be 135, 270, 540, 1080. Is that correct?

wang3702 commented 5 years ago

I have checked: that comes to about 104.607M total params, but in your paper it's around 26 million parameters. Could you please explain in more detail?

wang3702 commented 5 years ago

I figured it out: it's 16, 135, 270, 540. Total number of params: 26.116M. Is my understanding correct?
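For reference, a rough back-of-the-envelope count that roughly reproduces this figure. It assumes the standard WRN-28 layout (a 16-filter stem conv, three groups of 4 residual blocks with two 3x3 convs each, a 1x1 shortcut conv where the channel count changes, and a dense head to 100 classes); BatchNorm and bias terms are ignored, which is why it lands slightly below 26.116M:

```python
# Back-of-the-envelope parameter count for WRN-28 with --filters=135.
# Assumptions (not taken from the repo): 4 residual blocks per group
# (depth 28 = 6*4 + 4), two 3x3 convs per block, a 1x1 shortcut conv
# when channels change, and a 100-class dense head. BN/bias omitted.

def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out

def wrn_params(filters=135, scales=3, repeat=4, classes=100):
    widths = [filters << s for s in range(scales)]  # [135, 270, 540]
    total = conv_params(3, 3, 16)                   # stem conv
    c_in = 16
    for w in widths:
        total += conv_params(3, c_in, w) + conv_params(3, w, w)  # first block
        total += conv_params(1, c_in, w)                         # its shortcut
        total += (repeat - 1) * 2 * conv_params(3, w, w)         # remaining blocks
        c_in = w
    total += widths[-1] * classes + classes         # dense head
    return total

print('%.2fM' % (wrn_params() / 1e6))  # ~26.01M; BN/bias terms add the rest
```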

david-berthelot commented 5 years ago

I don't recall the exact number; in the paper we quoted 26M. When you run the code, you can see the size of the network parameters layer by layer before training starts.
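For example, a generic TF1-style sketch for printing those per-variable sizes once the graph is built (not the repo's exact logging code):

```python
# Print each trainable variable's shape and size, then the total.
# Assumes a TF1-style graph has already been constructed.
import numpy as np
import tensorflow as tf

total = 0
for v in tf.trainable_variables():
    n = int(np.prod(v.shape.as_list()))
    print(v.name, v.shape, n)
    total += n
print('Total trainable params: %.3fM' % (total / 1e6))
```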