Open Jxu-Thu opened 3 years ago
Good catch, thank you for pointing it out. Actually, the only difference between "FEW-SHOT-SUPERNET.config" and "ONE-SHOT-SUPERNET.config" is the number of training epochs. But this script is used for evaluation, so using FEW-SHOT-SUPERNET.config should work as desired. Still, to eliminate the confusion, I will change this line to ONE-SHOT-SUPERNET.config later. Thank you again for your attention.
I tried to reproduce your one-shot results on NAS-Bench-201 by running the training script https://github.com/aoiang/few-shot-NAS/blob/main/Few-Shot_NasBench201/supernet/one-shot/run.sh
and the evaluation script https://github.com/aoiang/few-shot-NAS/blob/main/Few-Shot_NasBench201/supernet/one-shot/eval.sh
However, your eval.sh is:
for (( c=0; c<=4; c++ ))
do
  OMP_NUM_THREADS=4 python ./exps/supernet/one-shot-supernet_eval.py \
    --save_dir ${save_dir} --max_nodes ${max_nodes} --channel ${channel} --num_cells ${num_cells} \
    --dataset ${dataset} --data_path ${data_path} \
    --search_space_name ${space} \
    --arch_nas_dataset ${benchmark_file} \
    --config_path configs/nas-benchmark/algos/FEW-SHOT-SUPERNET.config \
    --track_running_stats ${BN} \
    --select_num 100 \
    --output_dir ${OUTPUT} \
    --workers 4 --print_freq 200 --rand_seed 0 --edge_op ${c}
done
Why use the FEW-SHOT-SUPERNET.config?
I have changed FEW-SHOT-SUPERNET.config to ONE-SHOT-SUPERNET.config; it was a typo. Thanks again for pointing it out, and you can now run this script directly. If you have further questions, please let me know.
@aoiang can you please list the "exact steps" to reproduce the results? Can you please run your code on your end to ensure everything is okay with those steps? After you have done that, please update this thread.
Thanks for your kind reply! I ran train.sh and eval.sh and got the following results:
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|skip_connect~1|none~2|) evaluate : loss=78.47, accuracy@1=10.04%, accuracy@5=50.98%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|skip_connect~1|skip_connect~2|) evaluate : loss=1018350.04, accuracy@1=12.98%, accuracy@5=55.38%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|skip_connect~1|nor_conv_1x1~2|) evaluate : loss=122.31, accuracy@1=11.96%, accuracy@5=59.11%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|skip_connect~1|nor_conv_3x3~2|) evaluate : loss=115.14, accuracy@1=13.03%, accuracy@5=60.53%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|skip_connect~1|avg_pool_3x3~2|) evaluate : loss=1125109.73, accuracy@1=12.93%, accuracy@5=54.39%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_1x1~1|none~2|) evaluate : loss=1.50, accuracy@1=53.53%, accuracy@5=93.50%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_1x1~1|skip_connect~2|) evaluate : loss=26184.69, accuracy@1=12.32%, accuracy@5=57.85%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_1x1~1|nor_conv_1x1~2|) evaluate : loss=3.00, accuracy@1=46.15%, accuracy@5=90.89%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_1x1~1|nor_conv_3x3~2|) evaluate : loss=2.81, accuracy@1=50.91%, accuracy@5=92.60%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_1x1~1|avg_pool_3x3~2|) evaluate : loss=24581.70, accuracy@1=11.95%, accuracy@5=58.06%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|none~2|) evaluate : loss=1.18, accuracy@1=63.35%, accuracy@5=95.75%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|skip_connect~2|) evaluate : loss=23631.38, accuracy@1=11.73%, accuracy@5=57.01%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|nor_conv_1x1~2|) evaluate : loss=2.63, accuracy@1=52.52%, accuracy@5=93.15%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|nor_conv_3x3~2|) evaluate : loss=2.54, accuracy@1=54.93%, accuracy@5=93.94%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|nor_conv_3x3~1|avg_pool_3x3~2|) evaluate : loss=21931.14, accuracy@1=11.81%, accuracy@5=57.27%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|avg_pool_3x3~1|none~2|) evaluate : loss=86.85, accuracy@1=10.03%, accuracy@5=50.57%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|avg_pool_3x3~1|skip_connect~2|) evaluate : loss=1080665.85, accuracy@1=12.91%, accuracy@5=54.70%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|avg_pool_3x3~1|nor_conv_1x1~2|) evaluate : loss=109.15, accuracy@1=10.89%, accuracy@5=58.14%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|avg_pool_3x3~1|nor_conv_3x3~2|) evaluate : loss=101.58, accuracy@1=12.16%, accuracy@5=60.64%
current test-geno is Structure(4 nodes with |avg_pool_3x3~0|+|avg_pool_3x3~0|avg_pool_3x3~1|+|avg_pool_3x3~0|avg_pool_3x3~1|avg_pool_3x3~2|) evaluate : loss=1165846.05, accuracy@1=13.00%, accuracy@5=54.12%
..... (a lot of info like these)
However, I DO NOT find the Kendall tau result. Could you provide the steps to reproduce your ranking accuracy result?
As far as I know, evaluating a super-net is just a loop that obtains the proxy accuracy for each architecture in the super-net and then computes the rank correlation between the proxy accuracies and the ground-truth accuracies. So another question: I cannot understand why the eval script should be run 5 times for the one-shot super-net (rather than in parallel for speedup)?
@aoiang I need you to list ALL your "specific" steps to answer the above question!!
Sure, I will double-check my code on my side and write down every specific step to address this issue and reproduce the results. Thank you for your kind reminder.
Thank you for asking. Before answering your questions, I would like to list the specific steps to run our few-shot NAS on NAS-Bench-201.
Back to your questions. First, I would like to say that the output printed to the screen is as desired; in other words, the script was running properly.
For the first question (Kendall tau): you can follow the steps I list above; after running step 6, the Kendall tau will be printed to your screen.
The second question is about the eval script. The total number of architectures in NasBench201 is 15625, with 5 different operator types: skip_connection, average pooling, convolution 3x3, convolution 1x1, and none. I converted all architectures to a JSON file and then split it into 5 files based on the first operator in NasBench201 (the files are located in few-shot-NAS/Few-Shot_NasBench201/search_pool/). I did this because it is convenient for training the few-shot models. Therefore, each file contains 3125 architectures. Back to the evaluation script: we need to evaluate all 15625 architectures in NasBench201 to get their proxy accuracies, and we run the eval script 5 times because the architectures to be evaluated come from the 5 files mentioned above (you can go to few-shot-NAS/Few-Shot_NasBench201/search_pool/ and open them to find the details). In other words, each run evaluates 3125 architectures, and running 5 times covers all 15625.
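The evaluate-then-rank loop described above can be sketched in plain Python. This is a minimal illustration, not the repo's actual code: the dict names and accuracy values are made up, and Kendall's tau is implemented directly (tau-a, no tie correction) instead of calling the repo's rank.sh.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a rank correlation between two equally long lists."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Illustrative stand-ins: proxy accuracies collected from the eval runs,
# and ground-truth accuracies queried from the NAS-Bench-201 API.
proxy = {'arch_a': 63.4, 'arch_b': 10.0, 'arch_c': 54.9}
ground_truth = {'arch_a': 93.2, 'arch_b': 54.3, 'arch_c': 91.5}

archs = sorted(proxy)
tau = kendall_tau([proxy[a] for a in archs],
                  [ground_truth[a] for a in archs])
```

With real data the two dicts would hold all 15625 architectures, accumulated across the 5 eval runs before the single tau computation at the end.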
I am sorry for the confusion caused by the eval script. Thank you again for asking.
Thanks! I have reproduced your one-shot results with Kendall tau 0.5384 (close to the 0.5436 in your paper) :) . I am now reproducing the few-shot result on NAS-Bench-201.
Thanks for your effort to replicate our results. We're looking forward to your results.
I cannot reproduce your few-shot results.
I ran the few-shot scripts 5 times following your README.
Each training script prints:
FLOP = 61.52 M, Params = 1.42 MB search-space : ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3'] try to create the NAS-Bench-201 api from /home/data/NASBench-102/NAS-Bench-201-v1_1-096897.pth [2021-07-14 13:27:57] create API = NASBench201API(15625/15625 architectures, file=NAS-Bench-201-v1_1-096897.pth) done => loading checkpoint of the last-info '/home/checkpoint_few_shot_nas/one-shot/seed-0-last-info.pth' start => loading checkpoint of the last-info '{'epoch': 600, 'args': Namespace(arch_nas_dataset='/home/data/NASBench-102/NAS-Bench-201-v1_1-096897.pth', channel=16, config_path='configs/nas-benchmark/algos/ONE-SHOT-SUPERNET.config', data_path='/home/data/NASBench-201/cifar10/', dataset='cifar10', edge_op=None, edge_to_split=None, max_nodes=4, num_cells=5, print_freq=200, rand_seed=0, save_dir='/home/checkpoint_few_shot_nas/one-shot', search_space_name='nas-bench-201', select_num=100, track_running_stats=0, workers=4), 'last_checkpoint': PosixPath('/home/checkpoint_few_shot_nas/one-shot/checkpoint/seed-0-basic.pth')}' start with 0-th epoch.
[Search the 000-150-th epoch] Time Left: [00:00:00], LR=0.025 SEARCH [2021-07-14 13:28:01] [000-150][000/391] Time 0.40 (0.40) Data 0.05 (0.05) Base [Loss 0.558 (0.558) Prec@1 82.81 (82.81) Prec@5 99.22 (99.22)] SEARCH [2021-07-14 13:28:18] [000-150][200/391] Time 0.09 (0.09) Data 0.05 (0.04) Base [Loss 2.289 (1.923) Prec@1 17.97 (27.60) Prec@5 58.59 (65.96)] SEARCH [2021-07-14 13:28:33] [000-150][390/391] Time 0.06 (0.08) Data 0.02 (0.04) Base [Loss 2.308 (1.970) Prec@1 7.50 (25.30) Prec@5 57.50 (64.64)] [000-150] search [base] : loss=1.97, a
I see the few-shot super-net is inherited from the one-shot super-net as expected.
After running, I obtain the following files in /home/checkpoint_few_shot_nas/few-shot:
drwxrwxrwx 2 root root 4096 Jul 15 02:50 checkpoint
-rwxrwxrwx 1 root root 609555 Jul 15 09:40 few-shot-NAS_split_edge_ID_0_split_operation_conv1.log
-rwxrwxrwx 1 root root 609622 Jul 15 12:02 few-shot-NAS_split_edge_ID_0_split_operation_conv3.log
-rwxrwxrwx 1 root root 4185 Jul 15 02:51 few-shot-NAS_split_edge_ID_0_split_operation_none.log
-rwxrwxrwx 1 root root 610791 Jul 15 14:15 few-shot-NAS_split_edge_ID_0_split_operation_pool.log
-rwxrwxrwx 1 root root 610863 Jul 15 07:20 few-shot-NAS_split_edge_ID_0_split_operation_skip.log
-rwxrwxrwx 1 root root 1327 Jul 14 14:50 seed-0-last-info-split-0-op-0.pth
-rwxrwxrwx 1 root root 1327 Jul 14 14:57 seed-0-last-info-split-0-op-1.pth
-rwxrwxrwx 1 root root 1327 Jul 14 15:29 seed-0-last-info-split-0-op-2.pth
-rwxrwxrwx 1 root root 1327 Jul 14 15:12 seed-0-last-info-split-0-op-3.pth
-rwxrwxrwx 1 root root 1327 Jul 14 14:59 seed-0-last-info-split-0-op-4.pth
Then I ran few-shot/eval.sh and got the file few-shot-supernet.
Finally, I ran rank.sh and got Kendall tau 0.54295.
I ran the code with PyTorch 1.6.0 and CUDA 10.1.
Thank you for reproducing the few-shot model. We will double-check our code to see if there is any difference between our implementations. Once we finish the check on our side, I will let you know here.
@Jxu-Thu thank you. Did you run each method only once? Yiyang is investigating, and we'd like to confirm this first. Thank you.
@linnanwang Yeah. I only ran the experiments once.
I have a question. Previous works need to re-tune the BN (batch norm) parameters [re-calculating the BN statistics by forwarding the whole validation set] for each architecture when evaluating the performance of the architectures in the search space. After that, they calculate the Kendall tau rank correlation.
It seems that you do not re-tune the BN in the script exp/supernet/one-shot-supernet_eval.py? Are there any differences between your implementation and other methods?
Thank you for asking. Referring to Table 5 in the original NAS-Bench-201 paper (https://arxiv.org/pdf/2001.00326.pdf), there are two blocks in the table representing different BN types: the first block uses track_running_stats, while the second block does not keep running estimates but always uses batch statistics.
For our script, take one-shot as an example (bash ./supernet/one-shot/train.sh cifar10 0 NASBENCH201_PATH): the third parameter "0" controls the BN type and corresponds to the second BN type in the table (no running estimates; always use batch statistics). That is why we do not re-tune the BN in the script.
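The difference between the two BN types can be shown with a minimal pure-Python sketch of 1-D batch normalization. This is a simplification of what PyTorch's `BatchNorm2d(track_running_stats=...)` does (no affine parameters, no momentum update); the function name and arguments are illustrative:

```python
def batch_norm_1d(x, running_mean, running_var, track_running_stats, eps=1e-5):
    """Normalize a list of values either with stored running estimates
    (track_running_stats=True) or with the current batch's own mean and
    variance (track_running_stats=False, the second block of Table 5)."""
    if track_running_stats:
        mean, var = running_mean, running_var
    else:
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / (var + eps) ** 0.5 for v in x]

# With batch statistics the output is standardized on the fly, so there is
# nothing architecture-specific to re-estimate before evaluation.
out = batch_norm_1d([1.0, 2.0, 3.0],
                    running_mean=0.0, running_var=1.0,
                    track_running_stats=False)
```

In the track_running_stats=True regime, the stored mean/var were accumulated while the full supernet was training, so they are wrong for an individual sub-architecture and must be re-estimated; with batch statistics that mismatch never arises.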
@aoiang Many thanks! You are right. I found that some code with BN=0 still re-tunes BN, which is not necessary.
Any update of few-shot results?
Hi, I have trained and tuned the few-shot supernet in recent days, and the results have improved significantly; it is now clearly better than the one-shot model. This is a positive signal. I will upload the checkpoints of my few-shot model later today, and you are free to test them directly on your side. Thank you.
Any update of few-shot results?
Hi there, first, sorry for the late reply. I am currently working a full-time job and am busy with work. Also, due to limited computation resources, the experimental results come out slowly.
So far, the Kendall tau is 0.606 for few-shot vs. 0.502 for one-shot; the few-shot model shows a significant improvement.
We have uploaded the checkpoints of the few-shot model to Google Drive, see here (https://drive.google.com/drive/folders/13sZBqPxQsaoxxsJqDA6moPPgMI_udiHL?usp=sharing). You are free to download and evaluate them yourself. There is a README in the folder; follow it to evaluate the models and get the Kendall tau. With further tuning, the Kendall tau can be much higher. Thank you for your patience.
@aoiang great job Yiyang, thank you for the effort!
I ran two experiments:
seed 0: one-shot 0.5384, few-shot (five super-nets) 0.54295
seed 1: one-shot 0.530, few-shot (five super-nets) 0.55288
I have tried my best, but I fail to reproduce the results in your paper based on your code. However, I re-implemented your few-shot concept myself and got one-shot 0.52 vs. few-shot (five super-nets) 0.69 [a big improvement].
This validates the effectiveness of the few-shot concept, even though I cannot reproduce your results with your code.
Anyway. Thanks for your kind reply and nice work :)
What are you talking about? First, you have cost us nearly two weeks to help you get the checkpoint attached above, which gives you a rank correlation of 0.6 based on our implementation. You need to respect our time and read the results above. I don't know what might be wrong, but there must be something fishy in your code.
Second, it looks like we have established common ground that few-shot NAS is effective. You independently replicated few-shot NAS and tested it on NAS-Bench-201. Here is the rank correlation: few-shot 0.69 vs. one-shot 0.52; 0.69 is actually better than the 0.65 claimed in the paper.
Alright, to others who might have a similar question in the future, this is a great thread to read. It looks like we have all nailed down the conclusion that few-shot NAS is effective, so let's move forward and focus on the right thing. We will leave the thread open as a reference for other people.
Thank you all and great job Yiyang!
Hi aoiang,
Thanks for your nice work and the code. Unfortunately, when I strictly followed the README to reproduce the results using your released code, I only got a Kendall's tau of 0.591 for few-shot with 5 supernets. I wonder how to get the Kendall's tau of 0.653 reported in your paper? I provide my running log and evaluation results below: few_shot_5_supernets_exp.zip. Could you please help me find out the reason for this difference?
Great thanks and best wishes.
@ShunLu91 @aoiang The Kendall's tau of 0.653 reported in the paper is a mean value, meaning there are 6 choices of which edge to split to generate the 5 sub-supernets. I guess that choosing a different edge (one of the six) to split might give a better Kendall's tau, perhaps above 0.653. In your training code, can the user change which edge to split? OPERATION_TO_SPLIT (0-4) only selects among the different operations on ONE edge; another edge might be better.
@Pcyslist Thanks for your constructive suggestions. I concur that different edge_to_split values may yield different results, as Kendall's tau is really sensitive in my experience. After checking the experiments above, I adopted the code provided by @aoiang and found that the default edge_to_split is 0, as defined here. Generally, it's an interesting discussion and worth studying. I will try different edge_to_split values following your advice in the future.
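The split choices discussed above can be enumerated with a small sketch. The edge list and operator names follow the NAS-Bench-201 cell (4 nodes, 6 edges, 5 candidate ops per edge, as stated earlier in the thread); the dict layout is purely illustrative and not the repo's data structure:

```python
# NAS-Bench-201 cell: 4 nodes; edge (i, j) connects node i to node j.
EDGES = [(i, j) for j in range(1, 4) for i in range(j)]  # 6 edges
OPS = ['none', 'skip_connect', 'nor_conv_1x1', 'nor_conv_3x3', 'avg_pool_3x3']

def split_supernet(edge_to_split):
    """Fixing the op on one chosen edge yields 5 sub-supernets."""
    return [{'fixed_edge': EDGES[edge_to_split], 'fixed_op': op}
            for op in OPS]

# 6 possible edges to split; each choice produces its own 5 sub-supernets,
# which is why the reported tau can be averaged over the 6 split choices.
all_splits = {e: split_supernet(e) for e in range(len(EDGES))}
```

The repo's OPERATION_TO_SPLIT (0-4) indexes into the five entries of one such split; an edge_to_split argument would index into the six keys of all_splits.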
@ShunLu91 I have checked your few-shot training log and found that when you trained the five sub-supernets, you transferred different model weights to each of them. This is incorrect: you should always transfer the model weights obtained by one-shot training to each sub-supernet.
@ShunLu91 In addition, I would like to ask what type of GPU you use. Why is it so slow for me to train the one-shot supernet with two 3090 GPUs?
@Pcyslist Sorry, but I have to clarify that you might have some misunderstandings about my training. I followed the official instructions here to conduct the few-shot training rather than the one-shot training: I split the one-shot supernet into 5 sub-supernets. Thus, for evaluation, I transferred different model weights for each sub-network from its corresponding sub-supernet. As for the transfer learning, the authors say "For this experiment, we train sub-supernets by skipping the transfer learning described in Section 3.2." in Section 4.1.1 of their paper; therefore, the experiments on NAS-Bench-201 do not include the transfer-learning part. Besides, I trained each sub-supernet on a single NVIDIA V100 GPU. I am glad to discuss further.
"I followed the official instructions here to conduct the few-shot training rather than the one-shot training."——Before you conduct the few-shot training, you must conduct one-shot training at least once. Because the one-shot training generates the weight of original supernet which is used transferred to each sub-supernet. And then you can conduct few-shot training with initialized sub-supernet five times.
"Thus for evaluation, I transferred different model weights for each sub-network from their corresponding sub-supernet."——You should transfer the same weights (one-shot supernet) to each sub-supernet, because one-shot supernet is the parent supernet of 5 sub-supernets.
"For this experiment, we train sub-supernets by skipping the transfer learning described in Section 3.2." in Section 4.1.1 in their paper. Therefore, the experiments on NAS-Bench-201 do not have the part of the transfer learning."——No, it's not totally right. Only in the experiment of gradient based algorithm, author doesn't use transfer learning. As for search based algorithm, author conduct transfer learning obviously shown in figure6.
I ran one-shot training for 19 hours using two NVIDIA 3090 GPUs, far from the 6.8 hours reported in the paper. What can I do to accelerate training?
@Pcyslist Thanks for pointing out this error. I see that the transfer-learning step was missing from my training, and I will repeat the experiment. In summary, I should first transfer the same weights from a pre-trained one-shot supernet into each sub-supernet and then train the few-shot sub-supernets. After the few-shot training, I can use the corresponding few-shot sub-supernet to evaluate each sub-network. Is my understanding correct now? In this case, the training cost is obviously higher than conventional one-shot NAS because of this post-training, i.e., the transfer learning followed by few-shot training.
As for the training speed, I have checked my previous one-shot training logs on one NVIDIA V100 GPU and found that the training time is nearly the same as the reported 6.8 hours. Maybe you can increase num_workers to accelerate training.
@ShunLu91 Yes, you can give that a try. I have also solved the training-speed problem: I found that using a single GPU is much faster than using two GPUs.
@Pcyslist OK~Great thanks!
@ShunLu91 Hi! I wonder how long it took you to evaluate the 15625 networks? I found that evaluating the 15625 networks (about 15 hours) takes longer than training a supernet (about 6 hours). Is this normal?
@Pcyslist I adopted a large validation batch size of 512 with num_workers=16 when evaluating each sub-network. It only took 6.5 hours to evaluate all 15625 sub-networks. I suggest you find the most time-consuming step and optimize it.
Thank you, I have completed the whole reproduction experiment, and the experimental results obtained by running rank.py are as follows: I am slightly disappointed with this result. Obviously, few-shot NAS does not improve Kendall's tau much, which shows that even when splitting the supernet on the same edge (edge 0), each training run has great randomness. @ShunLu91
@Pcyslist Thanks for your results, which are an important reference for me and other users.