Closed: Jianshu-Hu closed this issue 3 years ago
These were the results I got.
Hi, we never ran any experiments with TD3 plus demos on any of the goal-based environments (we only used it for penspin). I'm not that surprised it doesn't work, because I don't think TD3 with dense rewards works well on any goal-based dexterous manipulation task. It would be interesting to try combining HER with demos in a similar way, which might work a bit better. I did run some initial experiments along these lines but never fully investigated it.
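To make the "HER with demos" idea concrete, here is a minimal sketch of one way it could look: each demonstration trajectory is relabeled with the goal it actually reached at the end (the "final" HER strategy) and given a sparse reward, so the relabeled transitions can be added to a replay buffer. This is not code from the repository. The function name `her_relabel_final`, the trajectory layout (T+1 observations and achieved goals, T actions), and the distance tolerance `tol` are all assumptions made for illustration.

```python
import numpy as np


def her_relabel_final(observations, actions, achieved_goals, tol=0.05):
    """Relabel one trajectory with its final achieved goal (HER 'final' strategy).

    Hypothetical layout: `observations` and `achieved_goals` have T+1 entries,
    `actions` has T entries. Returns a list of
    (obs, action, reward, next_obs, goal) tuples.
    """
    achieved_goals = np.asarray(achieved_goals, dtype=float)
    # Pretend the goal was whatever the trajectory actually ended up achieving.
    new_goal = achieved_goals[-1]

    transitions = []
    for t in range(len(actions)):
        # Sparse reward (an assumed convention): 0 once the achieved goal after
        # the step is within `tol` of the relabeled goal, -1 otherwise.
        dist = np.linalg.norm(achieved_goals[t + 1] - new_goal)
        reward = 0.0 if dist <= tol else -1.0
        transitions.append(
            (observations[t], actions[t], reward, observations[t + 1], new_goal.copy())
        )
    return transitions


if __name__ == "__main__":
    # Toy two-step "demonstration" in a 2-D goal space.
    obs = [np.zeros(3), np.ones(3), 2 * np.ones(3)]
    acts = [np.array([0.1]), np.array([0.2])]
    goals = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
    for _, _, r, _, g in her_relabel_final(obs, acts, goals):
        print(r, g)
```

The relabeled demo transitions would then be mixed with the agent's own (also relabeled) experience. How to weight the two sources is left open here, since the thread does not say how the initial experiments did it.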
Thank you so much for your reply. It is great to hear your suggestion of combining HER with demonstrations. I will try this idea later and hope it will work. Thanks!
I was trying to apply the TD3_plus_demos method to tasks such as TwoEggCatchUnderArm-v0 and EggCatchUnderarm-v0. I copied the demonstrations from the /prerun_trajectories folder to /TD3_plus_demos, modified the corresponding directories, and ran main.py. I noticed no improvement even after training for many time steps. I would appreciate any advice you can give. Thanks in advance!