clovaai / CutMix-PyTorch

Official PyTorch implementation of the CutMix regularizer
MIT License

Scripts for mentioned experiments #40

Open xuuuluuu opened 3 years ago

xuuuluuu commented 3 years ago

Hi, thanks for this great repo!

I am new to this area and I have a few questions regarding the experiments:

  1. Do you still have the scripts (or hyperparameter lists) for the experiments in the paper? For example, ResNet-50 and ResNet-101 on the ImageNet dataset, and PyramidNet-200 and PyramidNet-110 on the CIFAR datasets.
  2. When I ran ResNet-50 on CIFAR-10, the number of parameters was 0.5M, while the number of parameters of ResNet-50 on ImageNet is 25M. (This is the command I used: python train.py --net_type resnet --dataset cifar10 --depth 50 --alpha 240 --batch_size 64 --lr 0.25 --expname ResNet50 --epochs 300 --beta -1.0 --cutmix_prob 0.5 --no-verbose)
  3. How should I conduct transfer learning experiments?

Hope you can help me with this.

Thanks a lot!

hellbell commented 3 years ago
  1. The script for ResNet-50 on ImageNet is in the README (shown below). The same command trains ResNet-101 on ImageNet; just change --depth 50 to --depth 101.
    python train.py \
    --net_type resnet \
    --dataset imagenet \
    --batch_size 256 \
    --lr 0.1 \
    --depth 50 \
    --epochs 300 \
    --expname ResNet50 \
    -j 40 \
    --beta 1.0 \
    --cutmix_prob 1.0 \
    --no-verbose

    I'm not 100% sure, but training PyramidNet (110 or 200) on CIFAR should also be the same as in the README:

    python train.py \
    --net_type pyramidnet \
    --dataset cifar100 \
    --depth 200 \
    --alpha 240 \
    --batch_size 64 \
    --lr 0.25 \
    --expname PyraNet200 \
    --epochs 300 \
    --beta 1.0 \
    --cutmix_prob 0.5 \
    --no-verbose
  2. This is because the ResNet architecture differs between the CIFAR and ImageNet datasets; the CIFAR variant of ResNet-50 is far smaller than the ImageNet one. See https://github.com/clovaai/CutMix-PyTorch/blob/master/resnet.py#L87-L121 (a parameter-count sketch is shown below this list).
  3. First, train your own pretrained model, or just download one of our pretrained models (https://github.com/clovaai/CutMix-PyTorch/blob/master/README.md#experiments), and then fine-tune it on your downstream datasets (a rough fine-tuning sketch is also shown below this list).
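
For question 2, here is a minimal sketch (not code from this repo) that compares the parameter counts of the two ResNet-50 variants. It assumes torchvision is installed and that the repo's resnet.py, with its ResNet(dataset, depth, num_classes, bottleneck) constructor, is importable:

    # Minimal sketch: compare parameter counts of the CIFAR and ImageNet
    # ResNet-50 variants. ResNet(dataset, depth, num_classes, bottleneck) is
    # assumed to be this repo's constructor in resnet.py.
    from torchvision.models import resnet50
    from resnet import ResNet  # this repo's CIFAR/ImageNet ResNet

    def count_params(model):
        # Total number of trainable parameters.
        return sum(p.numel() for p in model.parameters() if p.requires_grad)

    imagenet_model = resnet50()                                # standard ImageNet variant
    cifar_model = ResNet('cifar10', 50, 10, bottleneck=False)  # small CIFAR variant

    print(f"ImageNet ResNet-50: {count_params(imagenet_model) / 1e6:.2f}M params")
    print(f"CIFAR-10 ResNet-50: {count_params(cifar_model) / 1e6:.2f}M params")

The ImageNet variant should report roughly 25.6M parameters, while the CIFAR variant stays well under 1M, which matches the gap you observed.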
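For question 3, here is a rough sketch of what the fine-tuning could look like. This repo ships no downstream code, so everything below is an assumption: the checkpoint is assumed to store weights under a 'state_dict' key with DataParallel 'module.' prefixes (as train.py saves them), the classifier attribute is assumed to be named fc, and the filename and NUM_DOWNSTREAM_CLASSES are placeholders:

    # Rough fine-tuning sketch, NOT code from this repo. The checkpoint layout,
    # attribute names, paths, and hyperparameters below are assumptions.
    import torch
    import torch.nn as nn
    from resnet import ResNet  # this repo's resnet.py

    NUM_DOWNSTREAM_CLASSES = 10  # placeholder for your downstream dataset

    model = ResNet('imagenet', 50, 1000, bottleneck=True)
    checkpoint = torch.load('resnet50_cutmix.pth.tar', map_location='cpu')  # placeholder path
    # Strip the 'module.' prefix that nn.DataParallel adds to parameter names.
    state_dict = {k.replace('module.', '', 1): v
                  for k, v in checkpoint['state_dict'].items()}
    model.load_state_dict(state_dict)

    # Swap the 1000-way ImageNet head for a downstream head, then fine-tune.
    model.fc = nn.Linear(model.fc.in_features, NUM_DOWNSTREAM_CLASSES)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                                momentum=0.9, weight_decay=1e-4)
    # ...then run a standard cross-entropy training loop on the downstream data.
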
xuuuluuu commented 3 years ago

Thanks a lot for the reply.

For the 3rd question, do you still have the code for fine-tuning the pre-trained models on the downstream datasets? I cannot find it in the current repo.

hellbell commented 3 years ago

This repo does not include downstream training/testing code. The settings for training/testing on downstream tasks are described in our paper.

xuuuluuu commented 2 years ago

Hi, have you experienced any performance gap between using one and two GPUs?