Closed Niyx52094 closed 3 years ago
Hello, this work on meta Q-learning is very inspiring, and I tried to implement it in my project. When I checked the code, I noticed that the list of previous observations in misc/runner_multi_snapshot.py adds the current obs twice, and I'm not sure whether that is a mistake. Based on the paper, these lists of previous observations, rewards, and actions are fed into the GRU to generate the context variable. If so, shouldn't the obs be added only once? That is, should the obs at line 80 be initialized randomly or with np.zeros()?

Hi there,
The duplicated observation only matters for the first time step: it lets the model select the first action based on the starting observation (see https://github.com/amazon-research/meta-q-learning/blob/master/misc/runner_multi_snapshot.py#L101). That said, I don't think the initialization of the observation history has a significant effect on results, so you can simply use zeros at line 80 instead.
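For readers implementing this themselves, here is a minimal sketch of the zero-initialization suggested above. The function and variable names (`init_history`, `act_dim`, `env`) are illustrative assumptions, not the repo's actual API; only the idea of seeding the first-step history with zeros comes from this thread.

```python
import numpy as np

def init_history(first_obs, act_dim):
    """Build the context buffers for the first time step of a rollout.

    Hypothetical sketch: the repo seeds the "previous observation" slot
    with a second copy of the starting observation; since that slot only
    influences the very first action, a zero vector works just as well.
    """
    prev_obs = [np.zeros_like(first_obs)]              # instead of [first_obs.copy()]
    prev_acts = [np.zeros(act_dim, dtype=np.float32)]  # no action has been taken yet
    prev_rews = [np.zeros(1, dtype=np.float32)]        # no reward has been received yet
    return prev_obs, prev_acts, prev_rews

# Usage with a gym-style env (illustrative):
#   obs = env.reset()
#   prev_obs, prev_acts, prev_rews = init_history(obs, env.action_space.shape[0])
```

Either way, only the context input at the first time step changes, so the rest of the rollout is unaffected.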
Hope that helps. Rasool
Got it, thank you!