melodyguan / enas

TensorFlow Code for paper "Efficient Neural Architecture Search via Parameter Sharing"
https://arxiv.org/abs/1802.03268
Apache License 2.0

Runtimes and results #66

Open philtomson opened 6 years ago

philtomson commented 6 years ago

In section 3.1 (Results) of the paper it says "Running on a single Nvidia GTX 1080Ti GPU, ENAS finds a recurrent cell in about 10 hours." I ran the suggested script (./scripts/ptb_search.sh) and it took about 30 hours to complete on a Titan V GPU.

I'm curious about the output files left after the script completed. There are several model.ckpt* files (checkpoint files) and a graph.pbtxt (which was created much earlier; I'm guessing it's the graphviz graph for the best RNN cell discovered?).

Could you add instructions to the README.md for getting the graphviz output?

In the stdout file I can see near the end:

Here are 10 architectures
[ 0  0  0  1  0  2  0  3  0  3  0  4  0  6  0  7  0  5  0  8  0 10  0] rw=0.553
[ 0  0  0  1  0  1  0  2  0  4  0  5  0  4  0  4  0  7  0  9  0 10  0] rw=0.581
[ 0  0  0  1  0  2  0  3  0  4  0  5  0  6  0  7  0  8  0  9  0 10  0] rw=0.461
[ 0  0  0  1  0  1  0  1  0  3  0  5  0  6  0  6  0  7  0  9  0 10  0] rw=0.631
[ 0  0  0  1  0  1  0  3  0  3  0  3  0  6  0  7  0  7  0  7  0 10  0] rw=0.925
[ 0  0  0  1  0  1  0  3  0  1  0  3  0  5  0  6  0  8  0  9  0 10  0] rw=0.722
[ 0  0  0  1  0  1  0  2  0  4  0  5  0  4  0  5  0  8  0  9  0 10  0] rw=0.888
[ 0  0  0  1  0  1  0  2  0  3  0  4  0  6  0  7  0  8  0  8  0 10  0] rw=0.499
[0 0 0 1 0 2 0 3 0 4 0 4 0 5 0 7 0 7 0 8 0 9 0] rw=0.459
[ 0  0  0  1  0  2  0  3  0  4  0  4  0  5  0  7  0  8  0  9  0 10  0] rw=0.684
Epoch 100: Eval
Eval at 132700
valid_total_loss: nan
valid_log_ppl: nan
valid_ppl: nan

(Note that the validation metrics are all NaN there.)
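To check whether NaNs show up earlier in a long run, the saved stdout can be scanned for them. This is a minimal sketch, assuming the eval metrics are logged one per line as `valid_<name>: <value>`, as in the excerpt above. The function name `find_nan_metrics` is hypothetical and not part of the enas repository.

```python
import math
import re

# Assumed log format, taken from the excerpt above: "valid_ppl: nan"
METRIC_LINE = re.compile(r"^(valid_\w+):\s*(\S+)\s*$")

def find_nan_metrics(log_text):
    """Return (line_number, metric_name) for every validation metric logged as NaN."""
    hits = []
    for line_number, line in enumerate(log_text.splitlines(), start=1):
        match = METRIC_LINE.match(line.strip())
        if not match:
            continue
        try:
            value = float(match.group(2))  # float("nan") parses to NaN
        except ValueError:
            continue  # skip values that are not numeric
        if math.isnan(value):
            hits.append((line_number, match.group(1)))
    return hits

sample_log = """Epoch 100: Eval
Eval at 132700
valid_total_loss: nan
valid_log_ppl: nan
valid_ppl: nan
"""
for line_number, name in find_nan_metrics(sample_log):
    print(f"line {line_number}: {name} is NaN")
```

Running it over the full stdout file would show the first step at which a metric became NaN, which may help narrow down where training diverged.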

I think 'rw' stands for reward here. In that case, I'd guess we want the architecture with the highest reward (0.925 in this case)?
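If 'rw' is indeed the reward, the sampled architectures can be pulled out of the stdout file and ranked. This is a rough sketch, assuming each architecture is logged as a bracketed list of integers followed by `rw=<float>`, as in the excerpt above. The helper names are hypothetical and not part of the enas repository, and whether the highest logged reward really identifies the best cell is still the open question here.

```python
import re

# Assumed log format, taken from the excerpt above:
# "[ 0  0  0  1 ... 10  0] rw=0.925"
ARCH_LINE = re.compile(r"^\[([\d\s]+)\]\s+rw=([0-9.]+)\s*$")

def parse_architectures(log_text):
    """Return a list of (architecture, reward) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = ARCH_LINE.match(line.strip())
        if match:
            arch = [int(token) for token in match.group(1).split()]
            pairs.append((arch, float(match.group(2))))
    return pairs

def best_architecture(log_text):
    """Return the (architecture, reward) pair with the highest reward, or None."""
    pairs = parse_architectures(log_text)
    return max(pairs, key=lambda pair: pair[1]) if pairs else None

sample_log = """Here are 10 architectures
[ 0  0  0  1  0  1  0  3  0  3  0  3  0  6  0  7  0  7  0  7  0 10  0] rw=0.925
[0 0 0 1 0 2 0 3 0 4 0 4 0 5 0 7 0 7 0 8 0 9 0] rw=0.459
"""
arch, reward = best_architecture(sample_log)
print(reward, arch)
```

The pattern accepts both the padded and the unpadded array formats that appear in the log.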

hex0102 commented 6 years ago

That script also ran for more than 24 hours for me, at which point I stopped the training. That is way more than 10 hours.