-
I'm implementing my own RL framework in JAX to better understand RL algorithms, and I found your code very helpful.
Looking at the NoisyNets implementation, on lines 316 and 317 (https://github.com/goog…
-
Currently, requires_grad means two things:
1) That we should compute gradients for this variable and for functions of this variable
2) On a "leaf" variable, that we should store the gradient to the…
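A small PyTorch example of the two meanings (the tensor names are only for illustration):
```
import torch

# (1) requires_grad on a leaf tensor asks autograd to track every
#     operation that uses it, and therefore every function of it.
x = torch.randn(3, requires_grad=True)

z = x * 2        # non-leaf: produced by an operation on x
out = z.sum()
out.backward()

# (2) Only the leaf keeps its gradient in .grad; a non-leaf drops its
#     gradient unless z.retain_grad() is called before backward().
print(x.is_leaf, x.grad)   # True  tensor([2., 2., 2.])
print(z.is_leaf, z.grad)   # False None
```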
-
Hi @Kismuz,
I was reading the paper "Noisy Networks for Exploration" and have a question about its usage in btgym. The paper says that "As A3C is an on-policy algorithm the gradients are unbiased w…
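For context, my (possibly wrong) reading of that section is that the noise should be drawn once per roll-out and then held fixed until the update. A rough sketch of that pattern, where `reset_noise()` and `act()` are hypothetical helpers rather than names from btgym, and `env` is a classic gym-style environment:
```
def collect_rollout(policy, env, rollout_len=20):
    # Draw one noise sample for the whole roll-out, so every action and
    # the subsequent gradient computation use the same noisy weights.
    policy.reset_noise()

    transitions = []
    obs = env.reset()
    for _ in range(rollout_len):
        action = policy.act(obs)   # reuses the frozen noise sample
        next_obs, reward, done, _ = env.step(action)
        transitions.append((obs, action, reward, done))
        obs = next_obs
        if done:
            break
    # ...compute the A3C update from `transitions` with the same noise,
    # then call reset_noise() again before the next roll-out.
    return transitions
```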
-
I'm running the following on 4 GPUs:
```
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50()   # assuming the standard torchvision ResNet-50
model = model.cuda()
criterion = nn.CrossEntropyLoss(reduction='mean').cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
…
```
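The snippet above is cut off, so it doesn't show how the four GPUs are actually used. For reference, a minimal (assumed) way to spread the forward pass over all visible GPUs is `nn.DataParallel`; this reuses the imports and objects from the snippet above:
```
# Sketch only: wrap the model so each forward pass splits the batch
# across all visible GPUs (here, the 4 GPUs).
model = nn.DataParallel(model)

# The training step itself is unchanged; `images`/`targets` stand in for
# a batch from whatever data loader is being used.
# outputs = model(images.cuda())
# loss = criterion(outputs, targets.cuda())
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```
For serious multi-GPU training, `torch.nn.parallel.DistributedDataParallel` with one process per GPU is usually preferred over `DataParallel`.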
-
![image](https://user-images.githubusercontent.com/23333028/48664090-cf836680-eadc-11e8-969b-5201db99907d.png)
-
Implement the best practices from the multi-agent RL community and Stable-Baselines3 into our algorithm. Further, analyse the similarities between the PettingZoo multi-agent implementation and the current RL implementa…
-
Hello,
I see one commit (6c8b281) that tries to fix the default value of the stddev in the Noisy layer, but I think it is overridden anyway by the default value in args.py, which is 0.1.
Moreover the…
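To make the interaction concrete, here is a rough sketch; `NoisyLinear`, `std_init`, and `--noisy-std` are illustrative names, not necessarily the ones used in the repo. Even if the layer's own default is the paper's 0.5, the argparse default of 0.1 is always passed in and therefore wins:
```
import argparse
import math
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Sketch of a factorised-Gaussian noisy layer; std_init defaults to
    the paper's sigma_0 = 0.5."""
    def __init__(self, in_features, out_features, std_init=0.5):
        super().__init__()
        bound = 1.0 / math.sqrt(in_features)
        self.weight_mu = nn.Parameter(
            torch.empty(out_features, in_features).uniform_(-bound, bound))
        self.weight_sigma = nn.Parameter(
            torch.full((out_features, in_features),
                       std_init / math.sqrt(in_features)))
        # bias parameters and noise buffers omitted for brevity

# args.py-style default: whatever is passed here reaches the layer, so
# the layer's own default of 0.5 never takes effect.
parser = argparse.ArgumentParser()
parser.add_argument('--noisy-std', type=float, default=0.1)
args = parser.parse_args([])

layer = NoisyLinear(64, 64, std_init=args.noisy_std)   # 0.1 wins
```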
-
I've been working with the code in different environments like Pong and other NES games, but almost all the time I see the same pattern: the loss goes down normally, but after some point it jumps to a v…
-
After changing the environment to `Assault`, the algorithm no longer works, unlike other implementations. Are there any plans to support other Atari environments?
-
Fortunato, Meire; Azar, Mohammad Gheshlaghi; Piot, Bilal; Menick, Jacob; Osband, Ian; Graves, Alex; Mnih, Vlad; Munos, Remi; Hassabis, Demis. "Noisy Networks for Exploration."
http://arxiv.org/abs/1706.10295