Open guoyijie opened 3 years ago
@avnishn
@guoyijie the current set of V1 metaworld environments is highly sensitive to the seed, so I wouldn't say it's out of the question that MT1 pick and place got such low performance.
We're releasing the metaworld-v2 environments in mid-November, and I expect performance to improve across different seeds.
Can you upload a tensorboard link, as well as try running the experiment once more?
Thanks! Avnish
@guoyijie you can send us a tensorboard link by using the https://tensorboard.dev service.
Do you mind trying MT1-reach? That would be a more reliable indication of a possible problem.
Thanks for your reply. Here is the additional information you requested.
(1) "tensorboard link": Here is the log I got by running the script "mtsac_metaworld_mt1_pick_place.py" https://tensorboard.dev/experiment/1Y42H2DbRUWobvG4Esdd9w/
(2) "run the experiment once more": I actually ran the script several times (though always with the same seed, 1), and the result is always the same: the policy never reaches a positive success rate. I'm now running it with a different seed, but I'm not sure whether that will work.
(3) "try MT1-reach": yes, I tried with MT1-reach. With the learned policy, this task can be easily solved with a success rate of 1.
Could you please let me know whether anything looks wrong in the tensorboard log, and whether you have any further suggestions?
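On point (2), rerunning with the same seed cannot show whether the failure is seed-dependent. Below is a minimal sketch of a seed sweep. It assumes a hypothetical `run_fn(seed)` that launches one training run and returns its final success rate; `run_fn`, `sweep`, `summarize`, and `fake_run` are illustrative names, not part of garage or metaworld.

```python
import statistics


def sweep(run_fn, seeds):
    """Call run_fn once per seed and collect the final success rates.

    run_fn is a hypothetical callable: seed -> success rate in [0, 1].
    """
    return {seed: run_fn(seed) for seed in seeds}


def summarize(success_by_seed):
    """Summarize how much the outcome varies across seeds."""
    rates = list(success_by_seed.values())
    return {
        "mean": statistics.mean(rates),
        # stdev needs at least two samples
        "stdev": statistics.stdev(rates) if len(rates) > 1 else 0.0,
        # number of seeds that reached any success at all
        "num_nonzero": sum(1 for r in rates if r > 0),
    }


def fake_run(seed):
    """Placeholder for a real training run (assumption: made-up numbers)."""
    return 0.0 if seed == 1 else 0.5


if __name__ == "__main__":
    results = sweep(fake_run, seeds=[1, 2, 3])
    print(results)
    print(summarize(results))
```

In practice `fake_run` would be replaced by a wrapper that starts the example script with the given seed and reads the final success rate from its logs.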
With the code in examples/torch/mtsac_metaworld_mt1_pick_place.py, the agent is not able to learn a good policy. After 10e6 environment steps, the success rate is still 0 and the average return is always negative. Is this the expected result?
I installed metaworld and garage with the command "pip install -e .[dev]".