Open Yuancheng-Xu opened 1 year ago
I was also wondering about this. It seems the 32x32 size of CIFAR-10 is incompatible with this model due to the down-sampling layers.
@Yuancheng-Xu It seems like it can. The downsampling layers should be set to a smaller kernel and stride size (2 and 2 respectively). Without this, the output of the downsampling layers is effectively the same size as the kernel. In addition, you might want to choose a smaller kernel and padding size for the Block convolutional layers Here's a notebook showing the training progress https://juliusruseckas.github.io/ml/convnext-cifar10.html
@Yuancheng-Xu I managed to get accuracy to 87% by making a few changes to the code in the link above. Basic changes are mentioned in this repository https://github.com/shamikbose/Fujitsu_Assessment Main changes were as follows:
Thanks a lot!
Hey @shamikbose, I tried training the ImageNet100 dataset for custom input_size = 32, but the accuracy that I am getting is too low. What could I change in the architecture (I tried with making the kernel and stride small)? Any other approach that might help me to get good accuracy?
@iamsh4shank The parameters used for ImageNet100 are mentioned in the paper. You should be able to reproduce it using those values.
Actually ig it was for input_size 224 but on changing it to 32 I get accuracy really low
With image size 32, try the parameters mentioned here https://github.com/facebookresearch/ConvNeXt/issues/134#issuecomment-1534986992
I did try changing the Conv layer (https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py#L28) with kernel size 3 and padding 1. Also, I changed the downsampling layer (https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py#L74) with kernel size 2 and stride 2. It did not change the accuracy much. I am getting test accuracy like 4-5 percent
Hi,
I am trying to train a convext on CIFAR-10 for a research project that doesn't allow using BN. I use the following configuration:
And the accuracy is only 75% percent (standard ResNet18 is about 93%). If I change the optimizer from AdamW to SGD, the best accuracy actually drops to below 50%. If I use the default input size 224, the accuracy is 84%, still significantly low.
Can ConvNeXt work on CIFAR10 without fine-tuning from a pretrained model? Could you provide a recommended set of hyper parameters for CIFAR10 (that should be robust to different types of optimizers and without mix-up and cutmix)?
Also I have another question on fine-tuning on CIFAR10: it seems that in the colab file the input_size is the default 224. However CIFAR10 image is 3232. Does this mean that in the data preparation stage the image will be padded to 224 224?
Thank you!