-
Hi @miyosuda, thanks for providing the code! When I experimented with it on games other than Pong (only the ROM name and ACTION_SIZE were modified), I found that A3C-FF does not seem to work very well. For example, a…
-
While trying the A3C example provided, I encountered the following error:
```
Training model
Training ACAgentRunner...
[2017-04-10 16:30:50,699] Making new env: CartPole-v0
Training ACAgentRunne…
```
-
Did anyone manage to get the A3C LSTM from this repo to work for Pong (using the OpenAI Gym)?
I have already tried several different optimizers, learning rates, and network architectures, but still no …
-
Hi,
Our goal is to minimize the loss. The loss consists of three parts:
- Value loss
- Policy loss
- Entropy (to encourage exploration)
As follows:
```
self.value_loss = 0.5 * tf.reduce_su…
```
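To make the three terms concrete, here is a minimal NumPy sketch of the same loss. The function name, the `1e-8` epsilon, and the `entropy_beta` weight are my own illustration, not the repo's exact code; only the general structure (squared-advantage value loss, `-log pi(a|s) * advantage` policy loss, minus an entropy bonus) follows the description above.

```python
import numpy as np

def a3c_loss(logits, actions, returns, values, entropy_beta=0.01):
    """Illustrative A3C loss: value loss + policy loss - entropy bonus."""
    # Softmax policy from raw logits (stabilized by subtracting the row max).
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    log_probs = np.log(probs + 1e-8)

    # Advantage = n-step return minus the critic's value estimate.
    advantage = returns - values

    # Value loss: 0.5 * sum of squared advantages.
    value_loss = 0.5 * np.sum(advantage ** 2)

    # Policy loss: -log pi(a|s) * advantage (advantage treated as a constant).
    chosen_log_probs = log_probs[np.arange(len(actions)), actions]
    policy_loss = -np.sum(chosen_log_probs * advantage)

    # Entropy of the policy; subtracting it encourages exploration.
    entropy = -np.sum(probs * log_probs)

    return value_loss + policy_loss - entropy_beta * entropy
```

With a uniform policy and zero advantage, only the entropy term contributes, so the total loss is slightly negative.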
-
Hi,
Based on figure 7 of the [ViZDoom](http://www.cs.put.poznan.pl/wjaskowski/pub/papers/Kempka2016ViZDoom.pdf) paper, I tried to use a skip count to speed up training, as follows:
`r = self.env.ma…
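For reference, a frame-skip wrapper along these lines can be sketched as follows. The class and the `DummyEnv` usage below are my own illustration (the question's actual code is truncated); it only assumes a Gym-style `step`/`reset` API:

```python
class FrameSkip:
    """Repeat each action `skip` times and sum the intermediate rewards.

    Assumes the wrapped env follows the classic Gym API:
    step(action) -> (obs, reward, done, info).
    """

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def step(self, action):
        total_reward = 0.0
        obs, done, info = None, False, {}
        for _ in range(self.skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:  # stop repeating once the episode ends
                break
        return obs, total_reward, done, info

    def reset(self):
        return self.env.reset()
```

With `skip=4` the agent picks an action every 4th frame, which cuts the number of forward/backward passes per environment frame roughly fourfold.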
-
Guys, Keras-rl is the best reinforcement learning library.
It is easy to use despite the complexity of the RL algorithms.
Keras-rl is far better than Stable Baselines.
Please add PPO, A3C, and others, as DQN is …
-
Hello Morvan (莫凡), I have recently been using your A3C code and have some questions about it:
1. On line 150 of A3C_RNN.PY, `buffer_r.append((r+8)/8)` — why is the reward transformed this way?
2. On line 186, `GLOBAL_RUNNING_R.append(0.9 * GLOBAL_RUNNING_R[-1] + 0.1 * ep_r)` — why is the total reward used for display computed this way?
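For context on the second question: that line is an exponential moving average of episode returns, which smooths the curve that gets plotted. A minimal sketch of the same update (the function name is my own; the `0.9`/`0.1` weights match the line quoted above):

```python
def update_running_reward(history, ep_r, alpha=0.9):
    """Append an exponentially smoothed return for display purposes.

    new = alpha * previous_smoothed + (1 - alpha) * episode_reward
    The first episode seeds the history directly.
    """
    if not history:
        history.append(ep_r)
    else:
        history.append(alpha * history[-1] + (1 - alpha) * ep_r)
    return history
```

Because each episode only contributes 10% to the displayed value, a single lucky or unlucky episode barely moves the curve, making training progress easier to read.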
-
I want to run the extended library by frenkowski, but I'm having trouble installing the version suggested in this repository and can't fix it.
Is the problem with my Python version? What version …
-
I have some code for a stock-trading game that uses Deep Q-learning (just standard DQN with experience replay), but I would like to use A3C LSTM with experience replay as per the research paper …
-
Hello everybody.
After training 5 agents in the harvest environment using the A3C algorithm and, for example, a baseline method, how can I save a video of a test run using the saved model and checkpoints?
I tried…