snu-mllab / PuzzleMix

Official PyTorch implementation of "Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup" (ICML'20)
MIT License

WRN28_10 accuracy jumps 69 to 79% #11

Closed khawar-islam closed 1 year ago

khawar-islam commented 1 year ago

Dear @Janghyun1230,

I am training a simple WRN28-10 to reproduce the paper results in Table 2. After epoch 120, the test accuracy jumps directly from 69% to 79%. Is that correct?

python3 main.py --arch wrn28_10 --dataset cifar100 --epochs 200 --schedule 120 170 --learning_rate 0.2

==>>[2023-01-16 05:34:25] [Epoch=119/200] [Need: 02:04:34] [learning_rate=0.2000] [Best : Accuracy=69.12, Error=30.88]
  **Train** Prec@1 87.300 Prec@5 98.022 Error@1 12.700
  **Test** Prec@1 65.700 Prec@5 88.650 Error@1 34.300 Loss: 1.410 

==>>[2023-01-16 05:35:56] [Epoch=120/200] [Need: 02:03:01] [learning_rate=0.0200] [Best : Accuracy=69.12, Error=30.88]
  **Train** Prec@1 95.604 Prec@5 99.456 Error@1 4.396
  **Test** Prec@1 79.200 Prec@5 94.600 Error@1 20.800 Loss: 0.837 

==>>[2023-01-16 05:37:30] [Epoch=121/200] [Need: 02:01:29] [learning_rate=0.0200] [Best : Accuracy=79.20, Error=20.80]
  **Train** Prec@1 97.876 Prec@5 99.696 Error@1 2.124
  **Test** Prec@1 79.180 Prec@5 94.780 Error@1 20.820 Loss: 0.831 

==>>[2023-01-16 05:39:01] [Epoch=122/200] [Need: 01:59:57] [learning_rate=0.0200] [Best : Accuracy=79.20, Error=20.80]
  **Train** Prec@1 98.458 Prec@5 99.760 Error@1 1.542
  **Test** Prec@1 79.330 Prec@5 94.700 Error@1 20.670 Loss: 0.832 

==>>[2023-01-16 05:40:34] [Epoch=123/200] [Need: 01:58:24] [learning_rate=0.0200] [Best : Accuracy=79.33, Error=20.67]
  **Train** Prec@1 98.846 Prec@5 99.822 Error@1 1.154
  **Test** Prec@1 79.510 Prec@5 94.640 Error@1 20.490 Loss: 0.838 
Janghyun1230 commented 1 year ago

Hello, yes, that's right. It's due to the step learning-rate scheduling at epoch 120. Vanilla training shows the same accuracy jump pattern.

A smooth learning-rate schedule, such as cosine or polynomial decay, produces a smooth increase in accuracy instead, but it converges to similar final performance.
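To illustrate the difference in closed form (a sketch; `step_lr` and `cosine_lr` are hypothetical helpers, with base LR 0.2 and 200 epochs taken from the command above):

```python
import math

def step_lr(epoch, base=0.2, milestones=(120, 170), gamma=0.1):
    # Piecewise-constant schedule: multiply by gamma at each milestone passed.
    return base * gamma ** sum(epoch >= m for m in milestones)

def cosine_lr(epoch, base=0.2, total=200):
    # Cosine annealing: decays smoothly from base to 0 over `total` epochs.
    return 0.5 * base * (1 + math.cos(math.pi * epoch / total))

for e in (0, 119, 120, 199):
    print(e, step_lr(e), round(cosine_lr(e), 4))
```

The step schedule holds the LR flat and drops it abruptly (hence the accuracy jump at epoch 120), while the cosine schedule shrinks it a little every epoch, so accuracy rises gradually toward a similar endpoint.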