katerakelly / oyster

Implementation of Efficient Off-policy Meta-learning via Probabilistic Context Variables (PEARL)
MIT License

What is the meaning of AverageTrainReturn_all_train_tasks, AverageReturn_all_train_tasks, and AverageReturn_all_test_tasks? #27

Closed tianyma closed 3 years ago

tianyma commented 3 years ago

Hi there, thank you for your excellent work. The output log reports three kinds of return: AverageTrainReturn_all_train_tasks, AverageReturn_all_train_tasks, and AverageReturn_all_test_tasks. What is the difference between them, and which one is the return reported in your paper?

katerakelly commented 3 years ago

Hello, thanks for your interest in our work!

AverageTrainReturn_all_train_tasks - average return achieved by an agent in a sampling of training tasks using context sampled from the replay buffer (implemented here: https://github.com/katerakelly/oyster/blob/44e20fddf181d8ca3852bdf9b6927d6b8c6f48fc/rlkit/core/rl_algorithm.py#L415-L436)

AverageReturn_all_train_tasks - average return achieved by an agent in a sampling of training tasks using context sampled by the current policy

AverageReturn_all_test_tasks - average return achieved by an agent in a sampling of testing tasks using context sampled by the current policy

(These last two are implemented via this function: https://github.com/katerakelly/oyster/blob/44e20fddf181d8ca3852bdf9b6927d6b8c6f48fc/rlkit/core/rl_algorithm.py#L379-L395)

The final metric, AverageReturn_all_test_tasks, is the one reported in our paper.
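
To make the distinction concrete, here is a minimal sketch of how the three numbers relate to each other. It is not the code in rl_algorithm.py. `summarize_returns`, `toy_evaluate`, and the `"replay_buffer"` / `"current_policy"` labels are hypothetical names made up for illustration, and the toy returns are placeholders rather than real results. The only things taken from the explanation above are which task set each metric averages over and where the context comes from.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical signature: (task_id, context_source) -> return of one evaluation rollout.
Evaluate = Callable[[int, str], float]


def summarize_returns(
    evaluate: Evaluate,
    train_tasks: Sequence[int],
    test_tasks: Sequence[int],
) -> Dict[str, float]:
    """Average per-task returns into the three logged statistics."""
    return {
        # Training tasks, context drawn from previously stored data.
        "AverageTrainReturn_all_train_tasks": mean(
            evaluate(t, "replay_buffer") for t in train_tasks
        ),
        # Training tasks, context gathered by running the current policy.
        "AverageReturn_all_train_tasks": mean(
            evaluate(t, "current_policy") for t in train_tasks
        ),
        # Held-out tasks, context gathered by running the current policy.
        "AverageReturn_all_test_tasks": mean(
            evaluate(t, "current_policy") for t in test_tasks
        ),
    }


def toy_evaluate(task_id: int, context_source: str) -> float:
    """Placeholder rollout: made-up numbers, only here so the sketch runs."""
    bonus = 1.0 if context_source == "replay_buffer" else 0.0
    return float(task_id) + bonus


stats = summarize_returns(toy_evaluate, train_tasks=[0, 1, 2], test_tasks=[3, 4])
for name, value in stats.items():
    print(f"{name}: {value}")
```

Only the last entry is computed on tasks the agent was not trained on, which is why it is the one to compare against the paper.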