MatheusMRFM / A3C-LSTM-with-Tensorflow

An implementation of the A3C deep reinforcement learning method using an LSTM layer. Created with TensorFlow.

Can you elaborate more on the initial state of the LSTM cell? #5

Open shamanez opened 5 years ago

shamanez commented 5 years ago

Here, after a given number of episodes (the batch size), we train the A3C agent by computing the returns. So we need to feed the states, returns, and advantages as a batch to the optimizer. But for the whole batch we feed only a single LSTM initial state: the one the cell had at the starting point of the batch. Is that correct? Couldn't we collect the LSTM states as they change during an episode and feed them as a batch of initial states?
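To make the question concrete, here is a minimal NumPy sketch (not code from this repository) of the two options. It assumes the batch is fed to the LSTM as one time-ordered sequence and that the weights do not change between acting and training. Under those assumptions, unrolling from the single stored initial state reproduces every intermediate state seen while acting. The names `lstm_step`, `unroll`, `W`, and `b` are hypothetical, chosen for illustration only.

```python
import numpy as np


def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps [x, h] to the four stacked gate pre-activations."""
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new


def unroll(xs, h0, c0, W, b):
    """Run the LSTM over a whole sequence, starting from (h0, c0)."""
    h, c = h0, c0
    hs = []
    for x in xs:
        h, c = lstm_step(x, h, c, W, b)
        hs.append(h)
    return np.stack(hs)


rng = np.random.default_rng(0)
obs_dim, hidden, T = 3, 4, 5  # toy sizes, chosen arbitrarily
W = rng.normal(scale=0.1, size=(obs_dim + hidden, 4 * hidden))
b = np.zeros(4 * hidden)
xs = rng.normal(size=(T, obs_dim))  # stand-in for T observed states

# Acting phase: the state is carried forward one step at a time.
h0, c0 = np.zeros(hidden), np.zeros(hidden)
h, c = h0, c0
acting_hs = []
for x in xs:
    h, c = lstm_step(x, h, c, W, b)
    acting_hs.append(h)
acting_hs = np.stack(acting_hs)

# Training phase: only (h0, c0) is fed, together with the whole sequence.
train_hs = unroll(xs, h0, c0, W, b)

# With unchanged weights, both phases produce identical hidden states.
states_match = np.allclose(acting_hs, train_hs)
```

If the batch were instead shuffled or split into independent samples, each sample would need its own stored state, which is the alternative the question describes.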