Closed yaukaizhi closed 1 year ago
Hi,
Do you mean the parameter sizes in hidden layers or the number of hidden layers?
I will assume the question is about the number of parameters. A common rule of thumb is to "blow up" your encoder's feature space by no more than 2x per layer, so that later layers can extract more complex features. Here, however, the input space is rather small and generally not very complex. Most likely not many hidden features could be extracted, so the wide layers mainly just provide parallel computation capacity, which seems to be enough.
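For a rough sense of scale, here is a small sketch of how layer widths translate into parameter counts for a fully connected network. The observation/action dimensions below are hypothetical placeholders, not the repo's actual values:

```python
def mlp_param_count(layer_sizes):
    """Total parameters (weights + biases) of a fully connected MLP
    whose layer widths are given in order, input first, output last."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total

# Hypothetical dims for illustration: a small observation/action space
obs_dim, act_dim = 24, 4
print(mlp_param_count([obs_dim, 800, 600, act_dim]))  # -> 503004
```

Note that almost all of the parameters sit in the 800x600 block between the two hidden layers; the small input layer contributes very little, which is why widening it is cheap here.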
A quick answer can be found here: https://stats.stackexchange.com/questions/214360/what-are-the-effects-of-depth-and-width-in-deep-neural-networks
In any case, the implementation is based on similar DDPG and TD3 solutions with comparable layer widths: https://ieeexplore.ieee.org/abstract/document/8202134 https://pemami4911.github.io/blog/2016/08/21/ddpg-rl.html
Why set 600 and 800 for the hidden layers? Genuinely curious; I've read a rule of thumb that hidden layer sizes should be roughly comparable to the input/output layer sizes. Thanks!!