Hi, this might not be the right place to ask, but I am wondering which hyperparameters you used to train the SAC agent (the data collection policy) for Humanoid Walk. The default hyperparameters reach expert-level performance within 1M steps on Walker Walk and Cheetah Run. I am using this codebase, as mentioned in the README.