melodyguan / enas

TensorFlow Code for paper "Efficient Neural Architecture Search via Parameter Sharing"
https://arxiv.org/abs/1802.03268
Apache License 2.0

Reproducibility of the results from the paper (RNN) on new repository #106

Closed pingguokiller closed 4 years ago

pingguokiller commented 5 years ago

@melodyguan

Thanks for your great paper and for sharing the code. I cloned the code from https://github.com/google-research/google-research/tree/master/enas_lm, which is the new repository for ENAS (RNN).

I followed your instructions without changing any code:

  1. python -m enas_lm.src.process
  2. python -m enas_lm.src.search \
       --output_dir="$(pwd)/output" \
       --data_path="$(pwd)/ptb/ptb.pkl"

However, I have run the experiment twice, each for more than 13 hours on an RTX 2080 Ti with TensorFlow 1.13.1. In both experiments the validation ppl has been stuck around 148 for about 10 hours.

Here is the result of one experiment:

step=90 ent=2.295 ppl=145.01 rw=0.5517 bl=0.5388 arc=[0 3 1 0 2 0 3 0 4 0 4 0 5 0 5 0 8 0]
step=100 ent=2.403 ppl=169.88 rw=0.4709 bl=0.5389 arc=[0 0 1 0 2 0 2 0 4 1 5 0 6 0 7 0 8 0]
step=110 ent=2.122 ppl=141.95 rw=0.5636 bl=0.5389 arc=[0 0 1 0 2 0 3 0 4 0 4 0 6 0 7 0 5 0]
step=120 ent=2.027 ppl=152.03 rw=0.5262 bl=0.5388 arc=[0 0 1 0 2 0 3 0 4 0 5 0 3 0 5 0 8 0]
step=130 ent=2.362 ppl=137.58 rw=0.5815 bl=0.5389 arc=[0 0 1 0 2 0 3 0 3 0 5 0 6 0 7 0 7 0]
step=140 ent=2.154 ppl=134.04 rw=0.5968 bl=0.5390 arc=[0 0 1 0 2 0 2 0 2 0 2 0 6 0 6 0 7 0]
step=150 ent=2.233 ppl=146.65 rw=0.5455 bl=0.5389 arc=[0 0 1 0 2 0 3 0 3 0 3 0 5 0 6 0 6 0]
step=160 ent=2.274 ppl=172.86 rw=0.4628 bl=0.5391 arc=[0 0 1 0 2 0 3 0 2 0 5 0 5 0 7 0 8 0]
step=170 ent=2.344 ppl=137.55 rw=0.5816 bl=0.5390 arc=[0 0 1 0 2 0 3 0 3 0 5 0 6 0 7 0 8 0]
step=180 ent=2.294 ppl=165.94 rw=0.4821 bl=0.5391 arc=[0 0 1 0 2 0 3 0 2 0 4 0 6 0 7 0 7 0]
step=190 ent=2.400 ppl=147.35 rw=0.5429 bl=0.5392 arc=[0 0 1 0 2 1 3 0 3 0 4 0 6 0 7 0 7 0]
step=200 ent=2.349 ppl=154.03 rw=0.5194 bl=0.5391 arc=[0 0 1 0 2 0 2 0 4 0 3 3 6 0 7 0 7 0]
step=210 ent=2.150 ppl=149.75 rw=0.5342 bl=0.5391 arc=[0 1 1 0 2 2 2 0 2 0 5 0 6 0 4 0 8 0]
step=220 ent=2.470 ppl=133.76 rw=0.5981 bl=0.5391 arc=[0 0 1 0 2 0 2 0 3 3 5 0 6 0 6 0 7 0]
step=230 ent=2.102 ppl=145.73 rw=0.5490 bl=0.5391 arc=[0 0 1 0 2 0 3 0 4 0 5 0 3 0 7 0 8 0]
step=240 ent=2.218 ppl=174.26 rw=0.4591 bl=0.5392 arc=[0 0 1 0 2 0 3 0 2 0 3 3 6 0 6 0 8 0]
valid_ppl=148.22
trian child net epoch=1553 step=546000 ppl=213.04 lr=14.29 |w|=10433.17 |g|=0.21 mins=834.37 child_x_train=(128, 25)
INFO:tensorflow:Saving checkpoints for 546070 into outputs/model.ckpt.
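To make the convergence trend easier to inspect, the per-step ppl and valid_ppl values can be pulled out of such logs with a small script. This helper is hypothetical (not part of the ENAS repository), and it assumes the log format matches the output above:

```python
import re

def extract_metrics(log_text):
    """Extract (step, ppl) pairs and valid_ppl values from an ENAS search log."""
    # Controller steps look like: "step=90 ent=2.295 ppl=145.01 ..."
    steps = [(int(s), float(p))
             for s, p in re.findall(r"step=(\d+) ent=[\d.]+ ppl=([\d.]+)", log_text)]
    # Validation results look like: "valid_ppl=148.22"
    valid = [float(v) for v in re.findall(r"valid_ppl=([\d.]+)", log_text)]
    return steps, valid

sample = "step=90 ent=2.295 ppl=145.01 rw=0.5517 bl=0.5388 valid_ppl=148.22"
print(extract_metrics(sample))  # ([(90, 145.01)], [148.22])
```

Plotting the extracted valid_ppl values over time would show whether the search is still improving slowly or has genuinely plateaued.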

In my opinion, since the paper says "Running on a single Nvidia GTX 1080Ti GPU, ENAS finds a recurrent cell in about 10 hours", the result should be obtainable in less than 10 hours on a 2080 Ti.

Is there anything wrong with my setup? How can I get approximately the results reported in the paper?
Could you help me? Looking forward to your reply.

Thanks!