D-X-Y / AutoDL-Projects

Automated deep learning algorithms implemented in PyTorch.

Disparities between NASBench-201 and NATS-Bench papers #96

Closed. Mirofil closed this issue 3 years ago.

Mirofil commented 3 years ago

Hello,

I noticed that the final accuracies of the searched models in the NATS-Bench paper are generally quite a bit higher than in the original NAS-Bench-201 paper, especially for the weight-sharing methods. I assume this is because the hyperparameters for those methods were changed (I see there is a note about that in the second paper)?

Furthermore, the NAS-Bench-201 paper has Table 6 in the appendix, which reports a correlation of about 90% between the 12-epoch training protocol and the final 200-epoch performance. However, when I try to reproduce this on NATS-Bench, I only get about 80%. Do you know whether that is expected?
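For reference, my reproduction roughly follows the sketch below. The exact `nats_bench` calls, dataset name, and `hp` values are my assumptions about the API (and assume the benchmark files are downloaded), so they may need adjusting:

```python
import numpy as np
from nats_bench import create

# Create the topology search space ("tss") API; fast_mode avoids loading
# everything into memory up front.
api = create(None, "tss", fast_mode=True, verbose=False)

acc_12, acc_200 = [], []
for index in range(len(api)):
    # Test accuracy under the 12-epoch and the 200-epoch training protocols,
    # averaged over seeds (is_random=False).
    info_12 = api.get_more_info(index, "cifar10", hp="12", is_random=False)
    info_200 = api.get_more_info(index, "cifar10", hp="200", is_random=False)
    acc_12.append(info_12["test-accuracy"])
    acc_200.append(info_200["test-accuracy"])

# Correlation between the two training protocols.
print(np.corrcoef(acc_12, acc_200)[0, 1])
```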

Thanks in advance

D-X-Y commented 3 years ago

Thanks for your interest.

For the first question, there are three reasons:

For the second question, I just gave it a try on my side and it is about 91%. Please see my demo code here: https://github.com/D-X-Y/AutoDL-Projects/blob/main/notebooks/NATS-Bench/issue-96.ipynb

Let me know if you have any questions.

Mirofil commented 3 years ago

Hello,

Thanks for the detailed answer. It turns out the issue is that I was computing the Spearman correlation rather than Pearson; with Spearman, the correlation is slightly lower, at around 80%. With Pearson I was able to get above 90%, as you did.
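To illustrate the gap between the two metrics, here is a small standalone SciPy example. The random arrays are stand-ins just so the snippet runs; in practice `acc_12` and `acc_200` would be the accuracy arrays queried from NATS-Bench as in the earlier sketch:

```python
import numpy as np
from scipy import stats

# Stand-in data: acc_200 is a "final" accuracy per architecture, acc_12 is a
# noisy proxy of it, mimicking the 12-epoch vs 200-epoch protocols.
rng = np.random.default_rng(0)
acc_200 = rng.normal(90.0, 3.0, size=1000)
acc_12 = acc_200 + rng.normal(0.0, 2.0, size=1000)

pearson_r, _ = stats.pearsonr(acc_12, acc_200)      # linear correlation
spearman_rho, _ = stats.spearmanr(acc_12, acc_200)  # rank correlation
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```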

Thanks!

D-X-Y commented 3 years ago

Good to know! Yes, a rank correlation (e.g., Spearman) is more suitable for this case, which is why we switched to the Kendall rank correlation coefficient in NATS-Bench :)
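For completeness, Kendall's tau can be computed with SciPy in the same way (again with stand-in arrays in place of the real NATS-Bench accuracies):

```python
import numpy as np
from scipy import stats

# Stand-in arrays; replace with the acc_12 / acc_200 values collected from
# the benchmark as in the earlier sketches.
rng = np.random.default_rng(1)
acc_200 = rng.normal(90.0, 3.0, size=1000)
acc_12 = acc_200 + rng.normal(0.0, 2.0, size=1000)

# Kendall's tau counts concordant vs. discordant pairs, so it depends only on
# the ranking each training protocol induces over the architectures.
tau, _ = stats.kendalltau(acc_12, acc_200)
print(f"Kendall tau = {tau:.3f}")
```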