katerakelly / oyster

Implementation of Efficient Off-policy Meta-learning via Probabilistic Context Variables (PEARL)
MIT License

Questions about result figures #28

ligengen closed 2 years ago

ligengen commented 2 years ago

Hi, I have a question about the result figures in Section 6 comparing the six benchmarks. The caption says "Test-task performance vs. samples collected during meta-training". Is that AverageReturn_all_test_tasks vs. Number of env steps total in progress.csv? I plotted all of them, but they do not always match.

Also, how do you draw the figures with the light-colored shading? Does the shading show the variance? If so, is it the variance of the average return across the 3 random seeds?

Also, I trained ant-goal. The best return is -440, which is far from your result (-200), but the other five are comparable. What might be the reason?

Thank you in advance!

[Image: Figure_2]

[Image: ori]

XyDrKRulof commented 2 years ago

The plots in the paper are a bit misleading. For Ant-Goal-2D you have a total of ten million steps, while for the other benchmarks you only have one million steps. Your plot seems comparable to the one in the paper if you look at the performance at around one million steps.
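
On the shading question: this thread does not confirm how the paper's figures were drawn, but a common convention is to plot the mean over seeds with a light band of plus or minus one standard deviation. Below is a minimal sketch of that convention, assuming each seed's progress.csv has the two columns mentioned above and that all seeds log at the same step counts. The helper names (`load_curve`, `aggregate_seeds`, `plot_band`) and the `max_steps` cutoff are hypothetical, not part of the repository, and the plotting step assumes matplotlib is installed.

```python
import csv
import statistics

# Column names as quoted in this thread; adjust if your progress.csv differs.
X_COL = "Number of env steps total"
Y_COL = "AverageReturn_all_test_tasks"


def load_curve(path, x_col=X_COL, y_col=Y_COL):
    """Read one progress.csv into parallel lists of steps and returns."""
    steps, returns = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            steps.append(float(row[x_col]))
            returns.append(float(row[y_col]))
    return steps, returns


def aggregate_seeds(curves, max_steps=None):
    """Combine per-seed (steps, returns) curves into (steps, mean, std).

    Assumption: every seed logged at the same step counts. Curves are
    truncated to the shortest run, and optionally to max_steps.
    """
    n = min(len(returns) for _, returns in curves)
    steps = curves[0][0][:n]
    if max_steps is not None:
        # Steps are assumed to be increasing, so this counts the kept prefix.
        n = sum(1 for s in steps if s <= max_steps)
        steps = steps[:n]
    mean, std = [], []
    for i in range(n):
        values = [returns[i] for _, returns in curves]
        mean.append(statistics.fmean(values))
        # Population standard deviation; a single seed gives zero spread.
        std.append(statistics.pstdev(values))
    return steps, mean, std


def plot_band(steps, mean, std, label=None):
    """Draw the mean curve with a light band of +/- one standard deviation."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    lower = [m - s for m, s in zip(mean, std)]
    upper = [m + s for m, s in zip(mean, std)]
    plt.plot(steps, mean, label=label)
    plt.fill_between(steps, lower, upper, alpha=0.2)
    plt.xlabel(X_COL)
    plt.ylabel(Y_COL)
```

To compare an Ant-Goal run against the other benchmarks at the same scale, one could load each seed with `load_curve`, call `aggregate_seeds(curves, max_steps=1_000_000)`, and pass the result to `plot_band`.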