floraljq closed this issue 7 months ago
Hi, the code for these baselines is in another branch. Here is the config file: https://github.com/twni2016/pomdp-baselines/blob/all-methods/PPO/run_all.yaml
So I think the main difference may be num-steps, which I set to 2048.
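To make the gap concrete, here is a rough sketch of how num-steps changes the amount of data behind each PPO update. It assumes the usual convention that one update consumes num-steps × num-processes transitions, split evenly into num-mini-batch minibatches. The function name `rollout_stats` is made up for illustration, and the second call keeps num-processes and num-mini-batch at the values from the question, which is an assumption and not something read from the linked config.

```python
def rollout_stats(num_steps, num_processes, num_mini_batch):
    """Return (transitions per update, transitions per minibatch).

    Assumes each update uses num_steps transitions from every process
    and divides them evenly across num_mini_batch minibatches.
    """
    batch_size = num_steps * num_processes
    return batch_size, batch_size // num_mini_batch


# Settings quoted in the question.
asker = rollout_stats(num_steps=128, num_processes=16, num_mini_batch=16)

# Same settings with only num-steps raised to 2048.
# ASSUMPTION: num_processes and num_mini_batch are left unchanged here;
# the values actually used in the linked config may differ.
changed = rollout_stats(num_steps=2048, num_processes=16, num_mini_batch=16)

print("question settings:", asker)
print("num-steps = 2048: ", changed)
```

Under these assumptions, each environment also contributes a 2048-step segment per update instead of a 128-step one, which may matter for a recurrent (GRU) policy and for advantage estimation over long MuJoCo episodes.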
Dear author,
I have read your paper on MuJoCo experiments and I am particularly interested in the hyperparameters used for PPO_GRU and A2C_GRU. I would greatly appreciate it if you could provide me with the code or detailed information regarding these hyperparameters.
In my own implementation of PPO_GRU, I used the code from https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail and set num-steps to 128, num-processes to 16, and num-mini-batch to 16. However, the results I obtained were significantly worse than the ones reported in your paper.
Could you help me understand whether there are any additional parameters or considerations that I might have overlooked? Your guidance would be very valuable to me.
Here are my complete parameters: