openai / gym

A toolkit for developing and comparing reinforcement learning algorithms.
https://www.gymlibrary.dev

MuJoCo tasks: what is the highest score, or when is a task solved? #1776

Closed zhan0903 closed 2 years ago

zhan0903 commented 4 years ago

Hi, for MuJoCo tasks such as Hopper-v2, HalfCheetah-v2, and Swimmer, what is the highest score we can achieve? Or how do we judge when a task is solved? Thanks.

jacekplocharczyk commented 4 years ago

Hi, if you are creating environments using gym.make(ENV_NAME), you can check the reward threshold (reward_threshold) for each environment in this file: https://github.com/openai/gym/blob/master/gym/envs/__init__.py
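The same value can also be read from an environment's spec rather than from the source file. The sketch below is a minimal illustration and uses a stand-in object so it runs without Gym or MuJoCo installed. The helper name `reward_threshold_of` is hypothetical, and the assumption is that a real spec (for example from `gym.spec("HalfCheetah-v2")` or `gym.make(ENV_NAME).spec`) exposes a `reward_threshold` attribute.

```python
from types import SimpleNamespace


def reward_threshold_of(spec):
    """Return the spec's reward_threshold, or None when it is not set.

    Hypothetical helper: works on any object, so a real Gym spec and the
    stand-in below are handled the same way.
    """
    return getattr(spec, "reward_threshold", None)


# Stand-in for a registered spec (assumption: mirrors the fields of a Gym spec).
half_cheetah = SimpleNamespace(id="HalfCheetah-v2", reward_threshold=4800.0)

print(half_cheetah.id, reward_threshold_of(half_cheetah))
```

With Gym installed, passing the real spec object instead of the stand-in should work the same way, provided the installed version still registers that environment ID.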

zhan0903 commented 4 years ago

Thanks for your response! According to that file, the reward threshold of "HalfCheetah-v2" is 4800, but some baselines, e.g., SAC or TD3 (which also create environments using gym.make(ENV_NAME)), reach a final score of around 10000. I am confused by this. Thanks.
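One way to read the gap between 4800 and 10000 is that the threshold marks a "solved" bar, not a ceiling on the return. The sketch below illustrates that reading. The function `is_solved` is hypothetical, and the 100-episode averaging window is an assumption borrowed from how older Gym documentation described "solving" an environment, not something stated in the file.

```python
from statistics import mean


def is_solved(episode_returns, reward_threshold, window=100):
    """Check whether the mean of the last `window` returns meets the threshold.

    Assumption: 'solved' means the average return over `window` consecutive
    episodes is at least `reward_threshold`.
    """
    if len(episode_returns) < window:
        return False  # not enough episodes to fill the averaging window
    return mean(episode_returns[-window:]) >= reward_threshold


# Hypothetical returns: an agent that scores about 10000 per episode.
returns = [10000.0] * 100
print(is_solved(returns, reward_threshold=4800.0))
```

Under this reading, an agent averaging around 10000 simply clears the 4800 bar by a wide margin; nothing in the check caps the return.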

jacekplocharczyk commented 4 years ago

Yeah I know that pain 😝

They probably used the *-v1 versions, which could have a different reward. But I don't know much about those versions.

zhan0903 commented 4 years ago

Thanks. I have trained SAC on "HalfCheetah-v2" for 1 million steps, and the final score is also over 10000. BTW, if there is no "reward_threshold" for an environment in that file (e.g., "Walker2d-v2" and "Humanoid-v2"), does that mean there is no limit on its maximum score?
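For environments registered without a threshold, the field is most likely just unset (None), which would mean no "solved" bar is defined rather than anything about the maximum score. A minimal sketch of reporting results under that assumption follows. The function `summarize` and the example returns are hypothetical.

```python
from statistics import mean


def summarize(env_id, episode_returns, reward_threshold=None):
    """Report the mean return, and a solved flag only when a threshold exists.

    Assumption: a missing threshold (None) means 'no solved criterion is
    defined', so the flag is left as None instead of True or False.
    """
    avg = mean(episode_returns)
    solved = None if reward_threshold is None else avg >= reward_threshold
    return {"env_id": env_id, "mean_return": avg, "solved": solved}


# Hypothetical returns, for illustration only.
print(summarize("Walker2d-v2", [3000.0, 3500.0]))
print(summarize("HalfCheetah-v2", [10000.0, 10200.0], reward_threshold=4800.0))
```

In the no-threshold case, the mean return is still usable for comparing algorithms against each other; only the solved/unsolved label is unavailable.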

jkterry1 commented 2 years ago

PR #2762 is about to be merged, introducing V4 MuJoCo environments using new bindings and a dramatically newer version of the engine. If this issue still persists with the V4 ones, please create a new issue for it.