Problem
Like many others (see #18 and #19), I cannot reproduce the results of the paper. I used commit af8417bfc82a3e249b4b02156518d775f29eb289 of Meta-World and the same parameters as in the paper and the docs. Here are the commands I ran:
Multi-task SAC
CARE
At first I thought my results differed because I had used too few seeds, so I ran 10 seeds for each method, as in the paper. That did not help, so I ran 20 seeds (seeds 1 to 20), with the same outcome. Here is the training success:
As you can see, the two methods yield nearly the same results. The variance is also much higher than in the paper.
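In case the way I aggregate across seeds matters, here is a minimal sketch of the kind of computation I mean, assuming each run gives one success-rate curve of equal length. The `aggregate_success` helper, the data layout, and the toy numbers are hypothetical and only for illustration; they are not the repository's code or my actual results.

```python
from statistics import mean, stdev


def aggregate_success(curves):
    """Return per-step mean and sample standard deviation across seeds.

    curves: list of equal-length lists, one success-rate curve per seed
    (a hypothetical layout, assumed here for illustration).
    """
    n_steps = len(curves[0])
    if any(len(curve) != n_steps for curve in curves):
        raise ValueError("every seed must have a curve of the same length")
    # Transpose so that each column holds all seeds' values at one step.
    columns = list(zip(*curves))
    means = [mean(col) for col in columns]
    # Sample standard deviation is undefined for a single seed; report 0.0.
    stds = [stdev(col) if len(col) > 1 else 0.0 for col in columns]
    return means, stds


# Toy values only; these are NOT results from the runs described above.
toy_curves = [[0.0, 0.5, 1.0], [0.0, 0.25, 0.5]]
toy_means, toy_stds = aggregate_success(toy_curves)
```

If the paper reports a different spread (for example standard error or a confidence interval rather than standard deviation), that alone could make the shaded regions look narrower, so knowing which one was used would help.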
Am I doing something wrong? At the time of writing, this repository has about 20 issues, and this is the third one about reproducibility. Could you provide the exact commands needed to recreate your results?
Thanks
System information