cannot reproduce the reported accuracy in the paper.

iboing / ISTA-NAS

released code for the paper: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding


cannot reproduce the reported accuracy in the paper. #2

Open cxxgtxy opened 3 years ago

cxxgtxy commented 3 years ago

Thanks for releasing the code. I tried to reproduce the CIFAR-10 result from scratch following your instructions (cutout enabled):

python ./tools/evaluation.py --auxiliary --cutout --onestage --arch ISTA_onestage

However, the model reaches 97.3% accuracy on CIFAR-10 after training for 600 epochs, which is lower than the 97.64% (2.36±0.06 error rate) reported in your paper. Could you provide the training logs to help me find the cause of the gap?

Here is one training log produced with your code: test_one_stage.log

Thanks again.
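For readers following the numbers, the gap can be worked out directly from the figures quoted above. This is only a sketch of the arithmetic: it assumes the paper's "2.36±0.06" is a mean error rate with a standard deviation over runs, which the thread does not confirm, and the variable names are made up for illustration.

```python
# Figures quoted in this thread; nothing here is read from the training logs.
PAPER_ERROR_MEAN = 2.36  # percent, reported as 2.36±0.06
PAPER_ERROR_STD = 0.06   # percent (assumed to be a standard deviation)
REPRODUCED_ACC = 97.3    # percent, from the reproduction run above


def error_to_accuracy(error_pct: float) -> float:
    """Convert a top-1 error rate in percent to accuracy in percent."""
    return 100.0 - error_pct


paper_acc = error_to_accuracy(PAPER_ERROR_MEAN)
gap = paper_acc - REPRODUCED_ACC
gap_in_stds = gap / PAPER_ERROR_STD

print(f"paper accuracy:  {paper_acc:.2f}%")
print(f"reproduced:      {REPRODUCED_ACC:.2f}%")
print(f"gap:             {gap:.2f} points ({gap_in_stds:.1f}x the reported spread)")
```

Under that assumption the reproduced run sits about 0.34 points below the reported mean, which is more than five times the reported spread.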

cxxgtxy commented 3 years ago

I reran it with different seeds. The best result is 97.4%, which is still lower than 97.64%. After all, the reported 97.64% is the best top-1 accuracy in DARTS papers so far, and I am eager to reproduce such a good result.

By the way, I also cannot reproduce the reported ImageNet result (76.0%) with your code (mine is 75.6%). I would appreciate it if you could release the log to help me find out what is wrong. Thanks!

iboing commented 3 years ago

Thanks for your attention. This is the training log of the experiment in our paper. training log cifar.log

I will check the released code soon.

cxxgtxy commented 3 years ago

Thanks! This is the log file for another run with seed 19 (97.4%). The only difference is the seed, s=19: test_one_stage_s19.log

Moreover, I would appreciate it if you could release the training log on ImageNet (76.0%).

iboing commented 3 years ago

One stage Imgnet resume.log One stage Imgnet.log One stage C10.log

Hi, the attached files are some of our logs from the original evaluation on ImageNet.

I have checked the code but did not find any bug.

cxxgtxy commented 3 years ago

Thanks! The remaining possibility, then, is the random seed. Can you provide more logs (with different seeds) for the model searched on CIFAR-10? I have run the released training script on CIFAR-10 with eight seeds, but none of them exceeds 97.5%. Several classmates of mine face the same issue.
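A seed sweep like the one described above can be summarized with a few lines of Python. This is a hedged sketch, not part of the released code: `summarize_runs` is a hypothetical helper, and the example list contains only the two accuracies actually quoted in this thread (97.3% and 97.4%), since the other six results were not posted.

```python
from statistics import mean, stdev


def summarize_runs(accuracies, target):
    """Summarize final test accuracies (percent) from runs with different seeds."""
    return {
        "n_runs": len(accuracies),
        "best": max(accuracies),
        "mean": mean(accuracies),
        # The sample standard deviation needs at least two runs.
        "std": stdev(accuracies) if len(accuracies) > 1 else 0.0,
        "n_reaching_target": sum(acc >= target for acc in accuracies),
    }


# Only the two values quoted in this thread; extend with the real per-seed results.
quoted_runs = [97.3, 97.4]
summary = summarize_runs(quoted_runs, target=97.64)
print(summary)
```

Reporting the mean and spread over all seeds, rather than only the best run, makes the comparison with a "mean ± spread" figure from a paper more direct.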

skeletondyh commented 3 years ago

I ran into the same issue as @cxxgtxy. I ran the command for evaluating one-stage ISTA-NAS on CIFAR-10 several times, following the README, but the accuracies were all lower than 97.5%.

tianyic commented 1 year ago

Thanks for the great work! I tried to reproduce the accuracy reported in the paper on CIFAR-10, but I only obtained around 93-94% accuracy by running

python ./tools/evaluation.py --auxiliary --cutout --onestage --arch ISTA_onestage

Any idea how to close this significant accuracy gap? Thanks! My experimental setup is an A100 server with torch 1.13.