Generated Archs is empty

Multi-Objective-NAS / self-supervised-nas

Official implementation of the paper "Pretraining Neural Architecture Search Controllers with Locality-based Self-Supervised Learning" (NeurIPSW 2020)


Generated Archs is empty #27

Open ArjunSridhar1 opened 3 years ago

ArjunSridhar1 commented 3 years ago

The is_valid check fails for every candidate produced in train.py, resulting in an empty list of generated architectures. What could be causing this? In particular: how is the encoder model from pretrain loaded into train? Is there a value that should be set in the .yaml for pretrain_model_path?

Any help would be greatly appreciated! Thank you so much for your time!

juice500ml commented 2 years ago

Hello! Thanks for your interest in our research! In our experiments (https://arxiv.org/abs/2103.08157), we haven't tried a randomly initialized encoder yet. However, I'm guessing your problem will be fixed if you pretrain the encoder with our method and then provide the pretrained encoder weights via pretrain_model_path. You can pretrain the encoder with CUDA_VISIBLE_DEVICES={device_index} python3 pretrain.py experiment=AngularLoss, or whichever loss you want to use.

bhavna-gopal commented 2 years ago

Can you please give us a step-by-step procedure?

When we run CUDA_VISIBLE_DEVICES={device_index} python3 pretrain.py experiment=AngularLoss, some weights are generated in the output folder. We don't see anything with "encoder" in its name; we see "embedder", "trunk" and "trunk-optimizer". There are six h5 files in total: each of the above names with "0" and "1".
Passing the path of the embedder file in train.yaml does not work. What should we be passing in here?
How do we run train.py? I am currently running "CUDA_VISIBLE_DEVICES=4 python3 train.py"; sometimes candidates is still empty and sometimes it is not. Is this expected? How do I output the generated networks / search space as well as the final selected network?

juice500ml commented 2 years ago

Hello @bhavna-gopal, sorry for the late reply.

"0" and "1" means the epoch index (i.e. we pretrain for 2 epochs). We don't use the embedder, we only use the trunk.
You have to pass the trunk path, e.g. python3 train.py pretrained_model_path=/path/to/your/weights/trunk-1.h5.
Without any pretraining, many runs end up with an empty candidate list (which shows the importance of the pretraining step!). The default search space is NAS-Bench-101; you can find the details here: https://github.com/google-research/nasbench https://arxiv.org/abs/1902.09635 .
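
For anyone following along, the steps above could be scripted roughly as below. This is only a sketch under assumptions, not part of the repository: it assumes the pretraining output folder contains files named like trunk-0.h5 and trunk-1.h5 (as in the example path above), and the helper names latest_trunk_checkpoint and build_train_command are hypothetical. The override key defaults to pretrained_model_path as written in the command above; this thread also mentions pretrain_model_path, so check which spelling your train.yaml actually uses.

```python
import re
from pathlib import Path
from typing import List, Optional


def latest_trunk_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the trunk-<epoch>.h5 file with the highest epoch index, or None.

    Assumes the naming seen in this thread (trunk-0.h5, trunk-1.h5).
    Files such as trunk-optimizer-1.h5 or embedder-1.h5 are ignored.
    """
    pattern = re.compile(r"^trunk-(\d+)\.h5$")
    best_path: Optional[Path] = None
    best_epoch = -1
    for path in Path(output_dir).iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_path = path
    return best_path


def build_train_command(
    trunk_path: Path, key: str = "pretrained_model_path"
) -> List[str]:
    """Build the train.py command line with the trunk weights as an override.

    The key name is an assumption; adjust it to match your train.yaml.
    """
    return ["python3", "train.py", f"{key}={trunk_path}"]
```

The resulting list can be passed to subprocess.run from the repository root, with CUDA_VISIBLE_DEVICES set in the environment as in the commands above. Comparing epoch indices as integers rather than strings keeps the choice correct if pretraining is ever run for ten or more epochs.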

I'm closing issue #28; let's continue the discussion here!