kcyu2014 / eval-nas

PyTorch Code for "Evaluating the search phase of Neural Architecture Search" @ ICLR 2020
MIT License

How are SPOS and FairNAS experiments actually done? #4

Open serser opened 3 years ago

serser commented 3 years ago

Hello, thanks for the impressive evaluation work!

However, I am confused about the search space used for the experiments on one-shot NAS algorithms (in the supplementary of your ICLR paper). Since both methods operate on an MBV2-like block-wise search space, some adjustment is required to apply them to the cell-based NAS-Bench-101.

According to the README.md at the root of the repository, bash scripts/nasbench-oneshot-search.sh search is used for one-shot evaluation, and it seems to use the full NAS-Bench-101 search space. I cannot follow every detail of the implementation, so could you explain how single-path sampling is actually done? Just in a nutshell?

Also, search_policies/cnn/search_space/README.MD proposes a linearized search space for these methods, but it does not seem to be used anywhere in the code.

I am grateful for any help you can provide.

kcyu2014 commented 3 years ago

Hi serser,

Thanks for your interest! Yes, we did adapt some one-shot NAS algorithms to make them work on NASBench-101.

For the single-path one-shot type of approach, please check this file for the logic. The random_sampler function implements it: essentially, it randomly samples one architecture from the search space at each step and trains it.
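
Roughly, the single-path sampling loop looks like the sketch below. The names here (`sample_nb101_spec`, the `supernet(images, matrix, ops)` forward) are illustrative assumptions for this explanation, not the identifiers used in this repository.

```python
import random

NUM_NODES = 7                                     # an NB101 cell has up to 7 nodes
OPS = ['conv3x3-bn-relu', 'conv1x1-bn-relu', 'maxpool3x3']

def sample_nb101_spec():
    """Randomly sample one architecture: an upper-triangular adjacency
    matrix plus one operation per intermediate node.
    (Validity checks such as connectivity and the 9-edge limit are omitted.)"""
    matrix = [[random.randint(0, 1) if j > i else 0 for j in range(NUM_NODES)]
              for i in range(NUM_NODES)]
    ops = ['input'] + [random.choice(OPS) for _ in range(NUM_NODES - 2)] + ['output']
    return matrix, ops

def train_single_path(supernet, loader, optimizer, criterion, device='cuda'):
    """One epoch of SPOS-style training: a fresh architecture per batch,
    so only the weights on the sampled path receive gradients."""
    supernet.train()
    for images, targets in loader:
        images, targets = images.to(device), targets.to(device)
        matrix, ops = sample_nb101_spec()         # new single path for this batch
        optimizer.zero_grad()
        logits = supernet(images, matrix, ops)    # forward only the sampled path
        loss = criterion(logits, targets)
        loss.backward()
        optimizer.step()
```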

As for the linearized search space, that is not relevant to the ICLR paper; it was an experiment I tried in order to let NASBench-101 mimic a MobileNet-like search space.

Let me know if you would like to know more.

P.S. I saw an email with more comments, but I cannot find the reply on the GitHub issue page yet. Is it because you deleted the message?

Best, Kaicheng

kcyu2014 commented 3 years ago

To your other question: FairNAS on NASBench-101 is indeed a quick prototype. Your understanding is correct: for each sampled topology, I mutate the operations following the original FairNAS policy. This is also the most direct application of the FairNAS paper; any additional treatment should be considered a FairNAS-v2 that improves on the original paper.

By the way, your proposal sounds interesting, but I have some other comments. FairNAS argues that you should train the weights in a fair way, which is why they sample all the candidate weights within one iteration. In NASBench-101, a different topology does not affect the weights because, unlike the DARTS space where the weights lie on the edges, the weights of NB101 lie on the nodes. That is, if at one iteration we sample a full topology (no matter what the edge connections are), all the weights will still be trained by the current operation sampler.
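
To make the adaptation concrete, here is a minimal sketch of that FairNAS-style step: the topology is fixed for the current batch, and each node sees every candidate operation exactly once before the optimizer update. Again, `sample_topology` and the `supernet(images, matrix, ops)` forward are illustrative assumptions, not this repository's actual API.

```python
import random

OPS = ['conv3x3-bn-relu', 'conv1x1-bn-relu', 'maxpool3x3']
NUM_CHOICE_NODES = 5                              # intermediate nodes of an NB101 cell

def fair_op_assignments():
    """One random permutation of OPS per node; across the len(OPS) sub-steps
    every node is assigned every candidate operation exactly once."""
    perms = [random.sample(OPS, len(OPS)) for _ in range(NUM_CHOICE_NODES)]
    return [[perms[node][k] for node in range(NUM_CHOICE_NODES)]
            for k in range(len(OPS))]

def train_fairnas_step(supernet, images, targets, optimizer, criterion, sample_topology):
    """One FairNAS-style update: gradients from all candidate operations are
    accumulated under a single, fixed topology before stepping the optimizer."""
    matrix = sample_topology()                    # topology fixed for this batch
    optimizer.zero_grad()
    for ops in fair_op_assignments():             # len(OPS) sub-steps, gradients accumulate
        logits = supernet(images, matrix, ['input'] + ops + ['output'])
        criterion(logits, targets).backward()
    optimizer.step()                              # single update after all ops were visited
```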

serser commented 3 years ago

@kcyu2014 Thanks for the detailed reply. I deleted the second question by mistake and I don't know how to recover it. I see that once the topology is fixed within the current batch, the operations can be sampled in a fair way. But since the topology is re-sampled every batch, do the weights really transfer easily to the new topology?

I've run SPOS vs. FairNAS for 50 epochs, and I see that neither supernet is trained properly, which would make the supernets fail to evaluate submodels. Just to double-check that I am doing everything right, could you still find the logs showing the supernet validation accuracy as well as the sampled models' predictions vs. the NB101 ground truth?

kcyu2014 commented 3 years ago

50 epochs is too short. In my next work ("How to Train Your Super-Net"), I did a full comparison of FairNAS vs. SPOS; FairNAS is slightly better even in the current formulation.

Cheers, Kaicheng


serser commented 3 years ago

Thanks, but my question still remains unanswered. Could you please share the search logs?

kcyu2014 commented 3 years ago

According to that other work, super-net accuracy (your question) does not correlate with its prediction power. Please check this paper (https://arxiv.org/pdf/2003.04276.pdf), page 8, Figure 9.

Regarding the logs for super-net validation vs. NB ground truth, I cannot find them since they are stored on an old server.

Besides, regarding your first question, I do not see why topology sampling would make the weight transfer hard. This exactly follows the RandomNAS (or single-path one-shot) approach on NB101.