Closed by xlnwel 5 years ago
That is correct. That section of the code evaluates train tasks to see how well the policy performs given the embeddings it is being optimized for during meta-training (rather than embeddings computed from freshly gathered exploratory data as in the actual test setting).
The evaluation of test tasks occurs here: https://github.com/katerakelly/oyster/blob/cd09c1ae0e69537ca83004ca569574ea80cf3b9c/rlkit/core/rl_algorithm.py#L444
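To make the distinction concrete, here is a minimal toy sketch of the two evaluation modes. It is not the oyster code: `infer_embedding`, `rollout`, `eval_train_task`, `eval_test_task`, and the scalar "goal" task are all hypothetical stand-ins chosen for illustration. The only point is where the context comes from: stored meta-training data for train tasks, freshly collected exploratory data for test tasks.

```python
import random

PRIOR_Z = 0.0  # hypothetical embedding used before any context is available


def infer_embedding(context):
    """Toy context encoder: return the action that earned the highest reward."""
    if not context:
        return PRIOR_Z
    best_action, _ = max(context, key=lambda pair: pair[1])
    return best_action


def rollout(goal, z, steps, noise, rng):
    """Act around the embedding z; reward is the negative distance to the task goal."""
    transitions = []
    for _ in range(steps):
        action = z + rng.gauss(0.0, noise)
        transitions.append((action, -abs(goal - action)))
    return transitions


def mean_reward(transitions):
    """Average reward over a list of (action, reward) pairs."""
    return sum(r for _, r in transitions) / len(transitions)


def eval_train_task(goal, replay_buffer, rng, context_size=8):
    """Train-task evaluation: context is sampled from data stored during meta-training."""
    context = rng.sample(replay_buffer, min(context_size, len(replay_buffer)))
    z = infer_embedding(context)
    return mean_reward(rollout(goal, z, steps=20, noise=0.05, rng=rng))


def eval_test_task(goal, rng, explore_episodes=2):
    """Test-task evaluation: context is gathered from scratch, starting from the prior."""
    context = []
    for _ in range(explore_episodes):
        z = infer_embedding(context)
        context += rollout(goal, z, steps=10, noise=1.0, rng=rng)
    z = infer_embedding(context)
    return mean_reward(rollout(goal, z, steps=20, noise=0.05, rng=rng))


rng = random.Random(0)
goal = 2.0
# Stand-in for a replay buffer filled during meta-training on this task.
replay_buffer = rollout(goal, goal, steps=50, noise=0.5, rng=rng)
print("train-task return:", eval_train_task(goal, replay_buffer, rng))
print("test-task return:", eval_test_task(goal, rng))
```

In this toy setup the train-task number reflects how well the policy does with the embedding it was optimized for, while the test-task number also depends on how informative the exploratory rollouts are.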
Thanks. That was my oversight.
Hi, thanks for your great work. However, it seems that you run the training environments at evaluation time: https://github.com/katerakelly/oyster/blob/cd09c1ae0e69537ca83004ca569574ea80cf3b9c/rlkit/core/rl_algorithm.py#L413