pybullet HumanoidBulletEnv-v0 score is not reasonable

openai / spinningup

An educational resource to help anyone learn deep reinforcement learning.

https://spinningup.openai.com/

MIT License


pybullet HumanoidBulletEnv-v0 score is not reasonable #174

Open zhan0903 opened 5 years ago

zhan0903 commented 5 years ago

Hi, I used the Spinning Up TD3 implementation to test pybullet's HumanoidBulletEnv-v0 environment, but got a test score of around 1600 right from the beginning of training, which is not normal (TD3 should not work on this benchmark). Has anyone seen similar results? Thank you.

shuangwu commented 4 years ago

I can confirm this. In fact, DDPG, TD3, and SAC all show a similarly unreasonable AverageTestEpRet. (plot)

shuangwu commented 4 years ago

You can see plots of all algorithms here. It seems that all off-policy algorithms show the same behavior, while all on-policy algorithms seem to be fine. (plot)