Closed: eliork closed this issue 3 years ago
Hello,
Have you tried stacking frames? Did it show any improvement in performance (learning-wise, in timesteps and wall time)?
Yes, I did. It only helps if you have communication delays, and if you have a continuity cost and want to damp oscillations (which I have in the newest, yet unpublished, version of that project ^^").
I'm doing that (frame-stacking + command history at the same time) here: https://github.com/DLR-RM/rl-baselines3-zoo/blob/master/hyperparams/sac.yml#L314
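For readers unfamiliar with the term, here is a minimal sketch of what frame-stacking does to the observation. It is not the zoo's implementation: the `FrameStack` class and its method names are hypothetical, and it assumes flat list observations instead of images to stay dependency-free.

```python
from collections import deque


class FrameStack:
    """Hypothetical helper: keep the last n_stack observations, concatenated."""

    def __init__(self, n_stack):
        self.n_stack = n_stack
        # deque with maxlen drops the oldest frame automatically
        self.frames = deque(maxlen=n_stack)

    def reset(self, first_obs):
        # Fill the buffer with copies of the first observation so the
        # stacked observation has a constant size from the first step.
        self.frames.clear()
        for _ in range(self.n_stack):
            self.frames.append(list(first_obs))
        return self.stacked()

    def step(self, obs):
        # Push the newest observation and return the stacked result.
        self.frames.append(list(obs))
        return self.stacked()

    def stacked(self):
        # Oldest frame first, newest frame last.
        return [x for frame in self.frames for x in frame]
```

With delays, the stacked observation lets the agent see how the measurement evolved over the last few steps instead of a single, possibly stale, snapshot.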
Another question I had is whether concatenating the command history improves learning compared to not using any command history.
As mentioned in the blog post, this is again to avoid breaking the Markov assumption. It is important to have n_history > 1 (also when you have a continuity cost).
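To make the Markov argument concrete, here is a rough sketch, not the project's actual wrapper. The `CommandHistory` class and the `continuity_cost` function are hypothetical names, and the squared-difference penalty is only one assumed form of a continuity cost. The point is that a reward depending on the previous command is only a function of the observation if that command is part of the observation.

```python
from collections import deque


class CommandHistory:
    """Hypothetical helper: append the last n_history commands to the observation."""

    def __init__(self, n_history, action_dim):
        self.n_history = n_history
        self.action_dim = action_dim
        self.commands = deque(maxlen=n_history)
        self.reset()

    def reset(self):
        # Start each episode with zero commands so the size is constant.
        self.commands.clear()
        for _ in range(self.n_history):
            self.commands.append([0.0] * self.action_dim)

    def record(self, action):
        # Store the command that was just sent; the oldest one is dropped.
        self.commands.append(list(action))

    def augment(self, obs):
        # Observation followed by the commands, oldest first.
        return list(obs) + [x for cmd in self.commands for x in cmd]


def continuity_cost(history):
    # Assumed form: squared change between the two most recent commands.
    # Needs at least two stored commands, hence n_history > 1.
    prev, last = history.commands[-2], history.commands[-1]
    return sum((a - b) ** 2 for a, b in zip(last, prev))
```

The same buffer also covers the delay case: commands that were sent but have not yet taken effect are visible to the agent through the augmented observation.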
Thank you!
https://github.com/araffin/learning-to-drive-in-5-minutes/blob/ccb27e66d593d6036fc1076dcec80f74a3f5e239/hyperparams/sac.yml#L16
Hey, from what I see here, you didn't stack frames. Have you tried stacking frames? Did it show any improvement in performance (learning-wise, in timesteps and wall time)? Another question I had is whether concatenating the command history improves learning compared to not using any command history. I am trying both methods, but I can't find any significant difference in my experiments. Thank you!