Closed: Maryamr314 closed this issue 4 years ago
`MultiTaskSampler`, which is responsible for sampling the trajectories, performs adaptation locally in each worker:
https://github.com/tristandeleu/pytorch-maml-rl/blob/0c2c7ddbcdb0065c5d0125edd38c82546b965ec4/maml_rl/samplers/multi_task_sampler.py#L251-L275
So in `test.py` you do get both the trajectories before and after adaptation with a single call to `MultiTaskSampler`. And with a few changes to `test.py` you can even use a different number of gradient steps for adaptation by changing `num_steps` in your call to `sampler.sample()`.
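To make the role of `num_steps` concrete, here is a minimal, purely illustrative sketch of a multi-step inner adaptation loop on a toy 1-D loss. This is not the repository's actual code (the real adaptation runs inside `MultiTaskSampler`'s workers on policy parameters); the `adapt` function, the quadratic loss, and all values below are made up for illustration.

```python
def adapt(theta, grad_fn, lr=0.1, num_steps=1):
    """Run `num_steps` gradient-descent steps of adaptation on one task.

    This mirrors what increasing `num_steps` does conceptually: each extra
    step applies one more gradient update to the task-adapted parameters.
    """
    for _ in range(num_steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# Toy task: minimize (theta - 3)^2, whose gradient is 2 * (theta - 3).
grad_fn = lambda theta: 2.0 * (theta - 3.0)

theta_before = 0.0
theta_after_1 = adapt(theta_before, grad_fn, lr=0.1, num_steps=1)
theta_after_5 = adapt(theta_before, grad_fn, lr=0.1, num_steps=5)
print(theta_after_1, theta_after_5)  # 5 steps lands closer to the optimum 3.0
```

With a well-behaved loss and step size, more inner steps move the adapted parameters closer to the task optimum, which is the intuition behind raising `num_steps` at test time.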
Thanks, that was really helpful.
Sorry for reopening this issue, but after changing `num_steps` I didn't get better results! (The number next to MAML shows `num-batches`.)
What is the environment? I haven't verified that performance improves with a larger number of gradient steps at test time.
Sorry for bothering you. It was my mistake. I found that if I lower the learning rate at both test and train time, I get better performance. (My environment is `half_cheetah_vel`.)
Hi, apologies if the question is a little dumb, but I can't figure out what's going on in `test.py`. Is there any learning phase in it? If not, how can I test the gradient update, and if so, where does the model learn?