XuGW-Kevin / DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.
MIT License

Performance issues with DrM in Metaworld environments #2

Closed by jzndd 3 months ago

jzndd commented 4 months ago

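Dear authors,

While reproducing DrM, we observed its impressive performance on the DMC-hard environments. However, it seems to perform poorly on many Metaworld environments: after training for 2M steps, the success rate is still only about 2/10. This happens on stick-pull-v2, pick-place-wall-v2, and hammer-v2. Performance is especially poor on disassemble-v2, where the success rate is still 0/10 after 2M training steps.

I ran 5 seeds for each of the tasks above to make sure the problem is reproducible, and all parameters are the defaults you provided. Did you ever see this kind of poor performance in your own runs?

Best wishes!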

cheryyunl commented 4 months ago

Thank you for the report. Could you tell me your mujoco, mujoco_py, gym, and MetaWorld versions? The camera view used in MetaWorld is also very important in visual RL, so could you check your camera view and MetaWorld settings? For the disassemble environment, we have provided a DrM visualization on our website.
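One way to collect the versions asked for above is a short script like the following. This is a minimal sketch: the distribution names in `PACKAGES` are assumptions about how the packages were installed, and a local editable install (such as a cloned Metaworld directory) may be registered under a different name or report a placeholder version.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust them to match your own installation.
PACKAGES = ["mujoco", "mujoco-py", "gym", "metaworld"]


def installed_versions(names):
    """Return a dict mapping each distribution name to its installed version."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The package is missing or installed under another name.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver}")
```

Pasting the printed lines into the issue makes it easier to compare environments side by side.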

Thank you again for the valuable suggestions. We will also re-check the reported problem against our repo. If possible, we will release our policy checkpoints trained with DrM in two weeks.

jzndd commented 4 months ago

Thank you very much for your prompt response to my problem report. I have checked my environment and can confirm the following versions:

- MuJoCo: 3.1.5
- mujoco-py: 2.1.2.14
- MetaWorld: 0.0.0 (located at /data/jzn/workspace/DrM/Metaworld)
- Camera view: "corner2"

All of my settings follow the instructions in readme.md, and I never modified any code in the repo. Furthermore, if possible, could you share your raw training curves, including the success rate recordings?

Best regards.

cheryyunl commented 4 months ago

Good. We found that in the last version, a version conflict occurs when following our README to build the MetaWorld environment. We are now testing whether all the configs are correct.

cheryyunl commented 3 months ago

Hi, we have completed our tests. The MetaWorld environment needs an action normalization wrapper and LayerNorm. We will update our code today.
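For readers unfamiliar with the two ingredients named above, here is a rough illustration of what they usually mean. It is only a sketch under stated assumptions, not the code committed to the DrM repo: `rescale_action`, `NormalizedActionEnv`, and `layer_norm` are hypothetical names, the wrapper assumes the policy outputs actions in [-1, 1] that must be mapped to the environment's native bounds, and the layer normalization omits the learned scale and shift that a real network layer would have. Plain Python lists are used to keep it dependency-free.

```python
import math


def rescale_action(action, low, high):
    """Clip each component to [-1, 1], then map it linearly to [low_i, high_i]."""
    scaled = []
    for a, lo, hi in zip(action, low, high):
        a = max(-1.0, min(1.0, a))  # clip the policy output to [-1, 1]
        scaled.append(lo + (a + 1.0) * 0.5 * (hi - lo))
    return scaled


class NormalizedActionEnv:
    """Hypothetical wrapper: the agent acts in [-1, 1], the env sees native bounds."""

    def __init__(self, env, low, high):
        self.env = env
        self.low = list(low)
        self.high = list(high)

    def reset(self, *args, **kwargs):
        # Observations pass through unchanged.
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        # Only the action is transformed before reaching the wrapped env.
        return self.env.step(rescale_action(action, self.low, self.high))


def layer_norm(features, eps=1e-5):
    """Normalize one feature vector to zero mean and (roughly) unit variance."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]
```

With this convention, `rescale_action([0.0], [0.0], [10.0])` maps the midpoint of the policy range to the midpoint of the environment range. Keeping the rescaling in a wrapper means the agent code does not need to know each task's action bounds.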

cheryyunl commented 3 months ago

Fixed the bug in the MetaWorld wrapper. All the implementations are now correct.