quark0 / darts

Differentiable architecture search for convolutional and recurrent networks
https://arxiv.org/abs/1806.09055
Apache License 2.0

Can not reproduce the search result on PTB dataset (Fig3)? #155

Open fortunechen opened 3 years ago

fortunechen commented 3 years ago

Hi, everyone.

I am trying to reproduce the model search performance shown in Fig. 3 on the PTB dataset, because that figure indicates that the searched model improves as search time grows.

Fig3

I ran train_search.py 4 times with different random seeds. The x-axis denotes how many GPU hours were spent searching to obtain the model, and the y-axis denotes the performance of the searched model after training for 300 epochs. Apart from the changes necessary to run the code successfully on PyTorch 1.7 + CUDA 11, I did not change anything in the original code.

As the result below shows, the searched model does NOT improve as search time grows.

my reproduce
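For anyone comparing runs the same way, here is a minimal sketch of how results from several seeds could be summarized per search-time checkpoint. It assumes performance is measured as perplexity (lower is better). The function names and the numbers in `placeholder_runs` are hypothetical placeholders for illustration, not measurements from this issue or from the paper.

```python
from statistics import mean, stdev


def summarize_by_search_time(runs):
    """Group results by search time and average them across seeds.

    runs: dict mapping seed -> list of (gpu_hours, perplexity) pairs.
    Returns a list of (gpu_hours, mean_perplexity, std_perplexity),
    sorted by gpu_hours.
    """
    by_time = {}
    for points in runs.values():
        for gpu_hours, perplexity in points:
            by_time.setdefault(gpu_hours, []).append(perplexity)

    summary = []
    for gpu_hours in sorted(by_time):
        values = by_time[gpu_hours]
        # stdev needs at least two samples; report 0.0 for a single seed.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary.append((gpu_hours, mean(values), spread))
    return summary


def improves_with_search_time(summary):
    """Return True if mean perplexity never increases as search time grows."""
    means = [m for _, m, _ in summary]
    return all(later <= earlier for earlier, later in zip(means, means[1:]))


# Hypothetical placeholder values, NOT real results.
placeholder_runs = {
    1: [(1, 60.0), (2, 61.0), (3, 62.0)],
    2: [(1, 62.0), (2, 61.0), (3, 64.0)],
}

if __name__ == "__main__":
    summary = summarize_by_search_time(placeholder_runs)
    for gpu_hours, avg, spread in summary:
        print(f"{gpu_hours} GPU hours: mean={avg:.2f} std={spread:.2f}")
    print("improves with search time:", improves_with_search_time(summary))
```

Reporting the spread across seeds alongside the mean makes it easier to tell whether a trend is larger than the seed-to-seed noise.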

fortunechen commented 3 years ago

Here is the code I used: darts-rnn with PyTorch 1.7 + CUDA 11.0.

fortunechen commented 3 years ago

I will try to reproduce the result on PyTorch 0.3.1 again.

chaoji90 commented 3 years ago

Empirically, DARTS search time and performance are often inversely proportional, and early stopping is usually inevitable.

fortunechen commented 3 years ago

Empirically, DARTS search time and performance are often inversely proportional, and early stopping is usually inevitable.

I agree. However, the total search time is the same as in the paper (50 epochs), and the performance is much lower than the authors'.

fortunechen commented 3 years ago

I will try to reproduce the result on PyTorch 0.3.1 again.

Here is the result without any modification to the original code.

result_origin