MatheusMRFM / A3C-LSTM-with-Tensorflow

An implementation of the A3C deep reinforcement learning method using an LSTM layer. Created with TensorFlow.

It does not train #2

Open pppn9595 opened 7 years ago

pppn9595 commented 7 years ago

Hello! I wrote my own A3C with LSTM, but it did not work perfectly. When I train the model in batches, it does not learn, but when I train on all the experiences of an episode at once (feeding the LSTM state only once, at the beginning of training), it works perfectly. I tried your code, but I ran into the same issue. I have heard that it may not train with TensorFlow 1.0 or newer. Which TensorFlow version do you use? Do you have any experience with Breakout-v0?

Thanks
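One possible explanation for the batch-versus-episode difference described above, not confirmed anywhere in this thread, is that the recurrent state is reset at every batch boundary instead of being carried over from the previous batch. The sketch below illustrates only that effect on the forward pass. It uses a toy scalar recurrent cell as a stand-in for an LSTM, and every name in it (`rnn_step`, `run_segment`, `run_in_batches`) is hypothetical and does not come from the repository.

```python
import math


def rnn_step(state, x, w_state=0.5, w_input=1.0):
    """One step of a toy recurrent cell (a stand-in for an LSTM cell)."""
    return math.tanh(w_state * state + w_input * x)


def run_segment(inputs, initial_state):
    """Run the cell over a segment; return per-step outputs and the final state."""
    state = initial_state
    outputs = []
    for x in inputs:
        state = rnn_step(state, x)
        outputs.append(state)
    return outputs, state


def run_in_batches(inputs, batch_size, carry_state):
    """Process one episode in fixed-size batches.

    carry_state=True feeds each batch the final state of the previous batch.
    carry_state=False resets the state to zero at every batch boundary.
    """
    state = 0.0
    outputs = []
    for start in range(0, len(inputs), batch_size):
        if not carry_state:
            state = 0.0  # assumed bug: the recurrent memory is discarded here
        batch_out, state = run_segment(inputs[start:start + batch_size], state)
        outputs.extend(batch_out)
    return outputs


# Made-up inputs standing in for one episode of observations.
episode = [0.1, -0.4, 0.7, 0.2, -0.9, 0.5, 0.3, -0.2]

full, _ = run_segment(episode, 0.0)
carried = run_in_batches(episode, batch_size=3, carry_state=True)
reset = run_in_batches(episode, batch_size=3, carry_state=False)

print("carried state matches full episode:", carried == full)
print("reset state matches full episode:  ", reset == full)
```

With the state carried over, the batched outputs are identical to the full-episode outputs. With the state reset, they diverge from the second batch onward. This covers only the forward pass; how far gradients flow back through time is a separate question that the sketch does not address.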

MatheusMRFM commented 7 years ago

Hello there! What do you mean by "all episode experiences"? Do you mean that you used a batch size equal to the number of steps in an episode?
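The batch size matters for the return targets as well. When a batch covers a whole episode, the discounted returns can be computed from the rewards alone. When a batch ends mid-episode, A3C normally bootstraps from the critic's value estimate of the next state. The sketch below is a minimal illustration under assumed values: the rewards and `gamma` are made up, `discounted_returns` is a hypothetical helper rather than code from the repository, and the "perfect" bootstrap value is an idealization that a real critic only approximates.

```python
def discounted_returns(rewards, gamma, bootstrap_value=0.0):
    """Compute R_t = r_t + gamma * R_{t+1}, seeded with bootstrap_value."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


gamma = 0.5  # made-up discount factor, chosen to keep the arithmetic exact
episode_rewards = [0.0, 0.0, 1.0, 0.0, 2.0]  # made-up rewards for one episode

# Batch size equal to the episode length: no bootstrap is needed.
full = discounted_returns(episode_rewards, gamma)

# Batch size of 3: the first batch ends mid-episode.
first_rewards, second_rewards = episode_rewards[:3], episode_rewards[3:]
second = discounted_returns(second_rewards, gamma)  # ends at the terminal state

# Idealized critic: its estimate for the next state equals the true return.
perfect_value = second[0]
first_bootstrapped = discounted_returns(first_rewards, gamma, perfect_value)

# Assumed mistake: treating the end of the batch as the end of the episode.
first_truncated = discounted_returns(first_rewards, gamma)

print("full episode:      ", full)
print("bootstrapped batch:", first_bootstrapped)
print("truncated batch:   ", first_truncated)
```

With the idealized bootstrap value, the first batch reproduces the full-episode returns. Without it, the returns are underestimated because the rewards collected after the batch boundary are ignored.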

As for my code, I ran it on the Pong-Deterministic environment, but I never waited for it to converge to the +21 score (I stopped at +15). I am indeed having difficulty running my code on the Breakout environment, though: the rewards increase really slowly, even though I used some of the parameters from the original A3C paper. After about 65k episodes (across all threads), it only scores about 40-50 points on average.