Open hawos opened 3 years ago
Hi, I've been trying to reproduce the NAS results, but I'm not quite getting there. I trained a generator on CIFAR-10 for 2000 iterations and then ran the architecture search with the following hyperparameters:

```json
{
  "noise_size": 128,
  "inner_loop_init_lr": 0.02,
  "lr": 0.0,
  "final_relative_lr": 1e-1,
  "inner_loop_init_momentum": 0.5,
  "generator_batch_size": 128,
  "meta_optimizer": "adam",
  "adam_beta1": 0.9,
  "adam_beta2": 0.9,
  "adam_epsilon": 1e-5,
  "meta_batch_size": 256,
  "validation_learner_type": "enas",
  "use_intermediate_losses": 16,
  "num_inner_iterations": 128,
  "num_meta_iterations": 800,
  "dataset": "CIFAR10",
  "learner_type": "enas",
  "step_by_step_validation": false,
  "randomize_width": false,
  "use_dataset_augmentation": 1,
  "generator_type": "cgtn"
}
```

It works, but my results are noticeably worse than yours. Am I missing something?

Hi @hawos, are the results worse during the architecture search or during the architecture evaluation?

The architecture search you ran with those parameters will output 800 different DAGs and an estimate of their performance (it should be low, <70%). But those architectures are only trained for 128 SGD steps; after evaluating the 800 architectures, you should select the best-performing DAG and retrain it using the CIFAR-10 NAS code.

Let me know if that doesn't work, Felipe
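The selection step Felipe describes (keep the DAG with the best estimated accuracy out of the 800, then retrain it) can be sketched as follows. This is a minimal illustration, assuming the search results are available as `(dag_id, estimated_accuracy)` pairs; the names and record format are assumptions, not the repo's actual output:

```python
# Hypothetical sketch of the selection step after the architecture search.
# Assumes each of the 800 meta-iterations yielded a (dag_id, estimated_accuracy)
# record; the real repo's output format may differ.

def select_best_dag(records):
    """Return the (dag_id, accuracy) pair with the highest estimated accuracy."""
    if not records:
        raise ValueError("no search results to select from")
    return max(records, key=lambda r: r[1])

# Example with made-up accuracy estimates (real ones should sit below ~70%):
results = [("dag_000", 0.61), ("dag_001", 0.68), ("dag_002", 0.64)]
best_id, best_acc = select_best_dag(results)
print(best_id, best_acc)
```

The chosen DAG is then the one to retrain fully with the CIFAR-10 NAS code, since the 128-SGD-step estimates are only a cheap ranking signal.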