TomVeniat / SANAS

Stochastic Adaptive Neural Architecture Search

Accuracy on test set #2

Closed vebh96 closed 5 years ago

vebh96 commented 5 years ago

I am trying to reproduce the accuracy reported in the SANAS paper. The paper gives an accuracy of 86.5%, but even after running the model for 2000 epochs the test accuracy is around 43%. Here is the output for one such epoch:

Train: 100%|██████████| 348/348 [04:20<00:00, 2.66it/s]
Validation: 100%|██████████| 49/49 [00:35<00:00, 3.11it/s]
Test: 100%|██████████| 49/49 [00:34<00:00, 1.42it/s]
INFO - main - Losses: 2.487(-1.368E+00)-2.467-2.484, Accuracies: 0.427-0.433-0.428, Avg cost: 1.246E+08-1.246E+08-1.246E+08
INFO - main - [175.0, 25.0, 45.0, 84.0, 176.0, 196.0, 158.0, 111.0, 62.0, 135.0, 269.0, 345.0]
INFO - main - Best Val: 0.439 - Test: 0.433 (Epoch 345.0)

How do I attain the claimed accuracy?

TomVeniat commented 5 years ago

Hi, something is indeed wrong: 2000 epochs should be more than enough to train a model. Could you provide the command you executed and/or what you modified in the code, if you tried anything different?

vebh96 commented 5 years ago

The command that I ran is:

python3 ./main.py with adam speech_commands gru kwscnn static=True use_mongo=False ex_path=./runs use_visdom=False

After some hyperparameter tuning I was able to improve on this: the best model reached an accuracy of 83.6% with both learning rates set to 1e-5. Could you recommend how I can increase the accuracy further?

TomVeniat commented 5 years ago

Right, we used learning rates between 3e-5 and 7e-5 in our experiments. I will update the repo with this information. How many epochs did it take to reach this accuracy? We observed that even when the accuracy seemed to plateau, training longer (a few hundred epochs) resulted in increased test accuracy. I think that coupling a learning rate scheduler with Adam could make things easier, but we have not tried it yet.
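As an illustration of the scheduler idea mentioned above, here is a minimal pure-Python sketch of a reduce-on-plateau schedule. It is not part of the SANAS code and was not tested on this task: the class name `PlateauScheduler` and the `factor`, `patience`, and `min_lr` values are hypothetical choices, and only the starting learning rate of 5e-5 comes from the range quoted above. In PyTorch, `torch.optim.lr_scheduler.ReduceLROnPlateau` implements a similar policy and could be attached to an Adam optimizer.

```python
class PlateauScheduler:
    """Hypothetical helper: shrink the learning rate when validation accuracy stalls."""

    def __init__(self, lr, factor=0.5, patience=3, min_lr=1e-6):
        self.lr = lr                # current learning rate
        self.factor = factor        # multiplicative decay applied on a plateau (assumed value)
        self.patience = patience    # non-improving epochs tolerated before decaying (assumed value)
        self.min_lr = min_lr        # lower bound on the learning rate (assumed value)
        self.best = float("-inf")   # best validation accuracy seen so far
        self.bad_epochs = 0         # consecutive epochs without improvement

    def step(self, val_acc):
        """Record one epoch's validation accuracy and return the learning rate to use next."""
        if val_acc > self.best:
            self.best = val_acc
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Plateau detected: decay the rate, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


# Start inside the 3e-5 to 7e-5 range mentioned above.
scheduler = PlateauScheduler(lr=5e-5, patience=2)
for val_acc in [0.40, 0.40, 0.40, 0.40]:
    current_lr = scheduler.step(val_acc)
```

Since the observation above is that accuracy can keep improving after an apparent plateau, a generous `patience` would probably be the safer choice, so that the rate is not decayed too early.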

TomVeniat commented 5 years ago

Oh, I just realized that the 86.5% accuracy you are referring to is actually the proportion of words matched, computed on the streaming audio task as described in the Speech Commands paper, Sec. 7.2. Accuracies on the test set are shown in Fig. 4 and are close to the accuracy you obtained in your last experiment.
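To make the difference between the two numbers concrete, here is a simplified sketch of a streaming-style matching metric. It is an illustration only, not the evaluation script used for the paper: the function name `streaming_match`, the greedy matching rule, and the 750 ms tolerance are assumptions. The idea is that a ground-truth word counts as matched when some detection falls within a time tolerance of it, and as correct when that detection also carries the right label, whereas test-set accuracy is computed per isolated clip.

```python
def streaming_match(truth, detections, tolerance_ms=750):
    """Return (matched_fraction, correct_fraction) over the ground-truth words.

    truth and detections are lists of (time_ms, label) pairs.
    tolerance_ms is an assumed value, not taken from the thread.
    """
    if not truth:
        return 0.0, 0.0
    used = set()   # indices of detections already paired with a ground-truth word
    matched = 0
    correct = 0
    for t_time, t_label in truth:
        for i, (d_time, d_label) in enumerate(detections):
            if i in used:
                continue
            if abs(d_time - t_time) <= tolerance_ms:
                used.add(i)
                matched += 1
                if d_label == t_label:
                    correct += 1
                break  # each ground-truth word pairs with at most one detection
    return matched / len(truth), correct / len(truth)


# Toy data: three spoken words and three detections in a continuous stream.
truth = [(1000, "yes"), (3000, "no"), (6000, "up")]
detections = [(1200, "yes"), (3100, "up"), (9000, "go")]
matched, correct = streaming_match(truth, detections)
```

In this toy example two of the three words are matched in time but only one is labelled correctly, which shows why a "matched" percentage is not directly comparable to classification accuracy on the test set.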

vebh96 commented 5 years ago

I got 83.6% at around 900 epochs. I didn't realize that the results were based on the streaming audio task. Do you think we could use the SANAS architecture on a convolutional recurrent keyword spotting model, or on one using attention modelling?

TomVeniat commented 5 years ago

Hi, these would be two interesting extensions, yes. Sorry for the late response; let me know if you try something along those lines.