Open hkhrais opened 4 years ago
Dear @hkhrais,
I don't think the point is that fewer layers are better. The exercise is only meant to build experience: on this data, adding more layers does not necessarily give better results. Personally, I would not focus on adding more dense layers (I tried a 3-layer model and the result was the same as the 2-layer model, but training took more time). As the book recommends, we can try other methods to improve the model. I replaced relu with tanh and got a small improvement. https://colab.research.google.com/drive/15kQxc7vqVoMGj2zgzVN9ss4iqukjHkrE?authuser=1 I hope this is useful to you!
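For anyone who wants to repeat the comparison, here is a minimal sketch of a model builder where the number of hidden layers and the activation are parameters. It assumes Keras is installed and that the input is the 10,000-dimensional multi-hot encoding used in the book's notebook; `build_model` and its argument names are hypothetical helpers for illustration, not something from the book.

```python
from keras import layers, models


def build_model(n_hidden=2, units=16, activation="relu", input_dim=10000):
    """Build a binary classifier with a configurable stack of Dense layers.

    Assumption: 16 units per hidden layer, rmsprop and binary
    cross-entropy, mirroring the setup in the book's IMDB example.
    """
    model = models.Sequential()
    model.add(layers.Input(shape=(input_dim,)))
    for _ in range(n_hidden):
        model.add(layers.Dense(units, activation=activation))
    # Single sigmoid unit: probability that the review is positive.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="rmsprop",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Build the variants discussed in this thread and compare their sizes.
for n_hidden, activation in [(1, "relu"), (2, "relu"), (3, "relu"), (2, "tanh")]:
    model = build_model(n_hidden=n_hidden, activation=activation)
    print(n_hidden, activation, model.count_params())
```

Each model can then be trained and evaluated with the same data, epochs and batch size as in the notebook, so that the depth or the activation is the only thing that changes between runs.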
In this example (IMDB), the author suggests trying 1 or 3 hidden layers.
With 2 hidden layers, as he used, you get 0.884. When I used one hidden layer I got 0.887, which confused me. I am not sure why 2 layers is presented as the "optimal architecture"?
ref: https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb
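One way to judge whether 0.887 versus 0.884 is a meaningful gap is a rough sampling-error estimate. The sketch below assumes both numbers are accuracies on the standard 25,000-review IMDB test split and treats the two evaluations as independent samples, which is only an approximation since both models are scored on the same reviews. It also ignores run-to-run variation from random weight initialization.

```python
import math


def accuracy_std_error(acc, n):
    """Binomial standard error of an accuracy measured on n examples."""
    return math.sqrt(acc * (1.0 - acc) / n)


n_test = 25_000      # assumption: standard IMDB test split size
acc_two = 0.884      # 2 hidden layers (from the post)
acc_one = 0.887      # 1 hidden layer (from the post)

se_two = accuracy_std_error(acc_two, n_test)
se_one = accuracy_std_error(acc_one, n_test)

# Standard error of the difference, under the independence approximation.
se_diff = math.sqrt(se_one ** 2 + se_two ** 2)
z = (acc_one - acc_two) / se_diff

print(f"standard error per model: about {se_two:.4f}")
print(f"difference: {acc_one - acc_two:.3f}, in standard errors: {z:.2f}")
```

Under these assumptions the gap is about one standard error, so the two results are hard to tell apart from a single run each.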