Closed PkuDavidGuan closed 4 years ago
Do you use KD?
OK, I did not add KD. Thanks for that.
But the results of other SOTA methods may not use KD. For example, I couldn't find a KD setting in FPGM. Is that a fair comparison?
That is a good question, and it is discussed in the paper. Other pruning papers start from trained unpruned models and implicitly or explicitly transfer the knowledge of the unpruned model to the pruned model. In this paper, we chose KD to transfer knowledge because of its simplicity, and our TAS is orthogonal to such knowledge transfer strategies. You can use the transfer strategy from FPGM with our TAS, but you will need to modify some code.
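For readers unfamiliar with KD, here is a minimal sketch of a standard knowledge distillation loss for a single sample, written in plain Python so it runs without any framework. It is not the code from this repository: the function names `softmax` and `kd_loss`, and the default values of `temperature` and `alpha`, are assumptions chosen for illustration and are not the settings used in the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.9):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    temperature and alpha are illustrative defaults (assumptions),
    not values taken from the paper or the repository.
    """
    # Cross-entropy of the student (pruned model) against the true label.
    ce = -math.log(softmax(student_logits)[label])

    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0
    )

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return (1.0 - alpha) * ce + alpha * (temperature ** 2) * kl


# The teacher is the unpruned model, the student is the pruned model.
loss = kd_loss([2.0, 0.5, -1.0], [3.0, 0.2, -2.0], label=0)
```

Setting `alpha=0.0` in this sketch reduces the loss to plain cross-entropy, which corresponds to training the pruned model without KD.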
Sorry to bother you, but I have another question: did you use cutout in the CIFAR experiments? When I retrained the unpruned resnet56/100 on CIFAR10/100, I also got an accuracy drop compared with your paper. I could only reproduce the result by adding cutout to the data augmentation.
No, we did not use cutout for data augmentation. We set cutout_length=-1 in https://github.com/D-X-Y/AutoDL-Projects/blob/master/exps/basic-main.py#L32, which disables cutout. You can check some of our original logs here: https://drive.google.com/open?id=1AWq5dQ3ilHQtSOFl0Jvlda_wk0PKe1Tc
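To illustrate how a non-positive cutout_length can switch the augmentation off, here is a minimal sketch in plain Python. It is not the repository's implementation: the helper name `maybe_cutout`, the list-of-rows image format, and the fill value of 0.0 are assumptions made for this example.

```python
import random


def maybe_cutout(image, cutout_length, rng=random):
    """Zero out a random square patch, or return a plain copy if disabled.

    image is a 2D list of pixel rows (a hypothetical format for this sketch).
    A cutout_length <= 0, such as -1, is treated as "cutout disabled".
    """
    out = [row[:] for row in image]
    if cutout_length <= 0:
        return out  # augmentation skipped entirely

    height, width = len(image), len(image[0])
    cy, cx = rng.randrange(height), rng.randrange(width)
    half = cutout_length // 2

    # Clip the patch to the image borders.
    y1, y2 = max(0, cy - half), min(height, cy + half)
    x1, x2 = max(0, cx - half), min(width, cx + half)
    for y in range(y1, y2):
        for x in range(x1, x2):
            out[y][x] = 0.0
    return out


image = [[1.0] * 8 for _ in range(8)]
unchanged = maybe_cutout(image, cutout_length=-1)
masked = maybe_cutout(image, cutout_length=4, rng=random.Random(0))
```

With cutout_length=-1 the image passes through untouched, so a baseline trained this way sees only the remaining augmentations.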
Dear D-X-Y, is the config the same as the one used in the NeurIPS 2019 paper: https://github.com/D-X-Y/AutoDL-Projects/blob/bc405a2e06272355db5fc173c832e7807bb558c6/configs/NeurIPS-2019/C010-ResNet110.config#L10