hill-a / stable-baselines

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
http://stable-baselines.readthedocs.io/
MIT License

[question] Does the LSTM state get reset to all zeros for each episode, mini-batch, etc.? #1022

Closed denyHell closed 4 years ago

denyHell commented 4 years ago

As titled.

If not, how can we do that?

Miffyli commented 4 years ago

Related issue #759

Hidden states are reset at the start of each episode (but the gradient is only computed over the currently collected batch of samples). For manually resetting the state, see this comment.

denyHell commented 4 years ago

Yes, I just found the lines cell_state = cell_state * (1 - mask) and hidden = hidden * (1 - mask), which do what you say.

Thanks a lot!
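For later readers, here is a minimal NumPy sketch of what the two masking lines quoted above do. It is not the library's actual code: the function name reset_lstm_state, the array shapes, and the example values are assumptions made for illustration. It also assumes mask holds 1.0 for each environment whose episode has just ended and 0.0 otherwise.

```python
import numpy as np


def reset_lstm_state(cell_state, hidden, mask):
    """Zero the recurrent state of every environment flagged in `mask`.

    Hypothetical helper for illustration only.
    cell_state, hidden: arrays of shape (n_envs, n_units)
    mask: array of shape (n_envs,), 1.0 where a new episode starts, else 0.0
    """
    # Reshape to (n_envs, 1) so the mask broadcasts across the unit dimension.
    mask = np.asarray(mask, dtype=cell_state.dtype).reshape(-1, 1)
    # Multiplying by (1 - mask) keeps the state where mask == 0
    # and sets it to zero where mask == 1.
    cell_state = cell_state * (1 - mask)
    hidden = hidden * (1 - mask)
    return cell_state, hidden


# Three parallel environments, two LSTM units each (assumed sizes).
cell_state = np.ones((3, 2), dtype=np.float32)
hidden = np.full((3, 2), 0.5, dtype=np.float32)
# Only the second environment has just finished an episode.
mask = np.array([0.0, 1.0, 0.0], dtype=np.float32)

new_cell, new_hidden = reset_lstm_state(cell_state, hidden, mask)
print(new_cell)
print(new_hidden)
```

Only the row belonging to the finished episode is zeroed. The other environments carry their state forward, which is why a single multiplication per step handles a batch of episodes that end at different times.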