Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
Hi @dennybritz
Is there a smart way or procedure to set hyperparameters? For example:
1- Parameter initialization method (Xavier or others)
2- LSTM length (in case of RNN)
3- Optimizer method (Adam or RMSProp)
4- Learning rate
5- Gradient clipping value
6- Reward value (in case of reinforcement learning)
7- ...
After dozens of experiments, I found that any tiny change in one of these affects the whole training dramatically, usually in a bad way!
Currently, I am not searching for the best combination; I am just searching for a good one.
It is also not practical to run a "grid search" over the different parameters, because a single experiment may take hours or days and cost a lot of money.
One trick I usually use is a large network with dropout to reduce or eliminate overfitting, but what about all of the above?
Another trick: try to adjust the learning rate so that learning rate * gradient ≈ 1e-3 * parameters. (In other words, make each parameter update around 1/1000 of the parameter value, to prevent updates that are too large or too small.)
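To make that last trick concrete, here is a minimal sketch of how I check the ratio. It assumes a plain SGD step (update = learning rate * gradient) and uses NumPy arrays as stand-ins for one layer's weights and gradients. The helper names `update_ratio` and `suggest_learning_rate` are made up for illustration and are not part of any library. With Adam or RMSProp the real update is rescaled by the optimizer, so the number would have to be measured on the actual update instead.

```python
import numpy as np

def update_ratio(param, grad, learning_rate):
    """Return ||learning_rate * grad|| / ||param|| for a plain SGD step."""
    update = learning_rate * grad
    # Small constant avoids division by zero for an all-zero parameter tensor.
    return np.linalg.norm(update) / (np.linalg.norm(param) + 1e-12)

def suggest_learning_rate(param, grad, target_ratio=1e-3):
    """Learning rate that would give roughly target_ratio for this tensor (hypothetical helper)."""
    return target_ratio * np.linalg.norm(param) / (np.linalg.norm(grad) + 1e-12)

# Random stand-ins for one layer's weights and gradients (not real training data).
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=(256, 128))
grads = rng.normal(scale=0.01, size=(256, 128))

lr = suggest_learning_rate(weights, grads)
print("suggested learning rate:", lr)
print("update/parameter ratio:", update_ratio(weights, grads, lr))
```

In practice I would log this ratio per layer every few hundred steps rather than set the learning rate from a single batch, since the gradient norm changes a lot during training.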
What do you recommend?