It is not train

MatheusMRFM / A3C-LSTM-with-Tensorflow

An implementation of the A3C deep reinforcement learning method using an LSTM layer. Created with TensorFlow.

Hello there! What do you mean by "all episode experiences"? Do you mean that you used a batch size equal to the number of steps in an episode?
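To make the question concrete, here is a minimal sketch of the two options: one batch covering the whole episode versus several short fixed-size batches. It is not taken from the repository; the function names, the reward list, and the batch size of 3 are all hypothetical and chosen only for illustration.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float,
                       bootstrap: float = 0.0) -> List[float]:
    """Discounted return for each step of one batch of rewards.

    `bootstrap` stands in for the critic's value estimate of the state
    that follows the batch (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap
    # Walk backwards so each step accumulates the discounted future reward.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def split_into_batches(steps: List[float], batch_size: int) -> List[List[float]]:
    """Cut one episode's steps into consecutive batches of at most batch_size."""
    return [steps[i:i + batch_size] for i in range(0, len(steps), batch_size)]


# Hypothetical rewards collected during a single six-step episode.
episode_rewards = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

# Option 1: batch size equal to the episode length (one update per episode).
full_episode_batch = split_into_batches(episode_rewards, len(episode_rewards))

# Option 2: a small fixed batch size (several updates per episode).
short_batches = split_into_batches(episode_rewards, 3)
```

With option 1 the returns come entirely from observed rewards. With option 2 every batch except the last would need a `bootstrap` value from the critic, so the two setups can behave quite differently during training.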

As for my code, I ran it on the Pong-Deterministic environment, but I never waited for it to converge to the +21 score (I stopped at +15). I am indeed having difficulty with the Breakout environment, though: the rewards increase really slowly, even though I used some of the parameters from the original A3C paper. After about 65k episodes (across all threads), it only scores about 40-50 points on average.

It is not train #2