-
Hello, I hope you're doing well,
I have been attempting to replicate BoB's publication results, and although on the surface I appear to be getting some of them, after digging deeper and doing my …
-
Thanks for the nice code. I am trying to reproduce the result on "Pendulum-V0" using a3c_cont.py, but it seems the model fails to converge. I have tried various methods like experience replay, but still n…
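For context, here is a minimal sketch of the uniform experience replay buffer the comment mentions, in plain Python; it illustrates the technique, not code from a3c_cont.py (and note that vanilla A3C is on-policy, so naively replaying old transitions can itself hurt convergence).

```python
# A minimal sketch of a uniform experience replay buffer, plain Python;
# an illustration of the technique mentioned above, not code from
# a3c_cont.py.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch over stored transitions.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```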
-
I came across this repo via the DJL repo, and you might be interested in the ONNX export functionality we built into Tribuo's latest release (v4.2.0). We have a separate module (which only depends on pro…
-
CURL: Contrastive Unsupervised Representations for Reinforcement Learning, [paper](https://proceedings.icml.cc/static/paper_files/icml/2020/5951-Paper.pdf).
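CURL's objective is an InfoNCE loss over two augmented views of the same observation, scored with a bilinear similarity; a minimal sketch assuming PyTorch follows, where the function name and shapes are illustrative rather than the authors' reference implementation.

```python
# A minimal sketch of CURL's InfoNCE loss, assuming PyTorch; q comes from
# the query encoder, k from the momentum (key) encoder, and W is a learned
# bilinear matrix. Names and shapes are illustrative, not the paper's code.
import torch
import torch.nn.functional as F

def curl_infonce_loss(q, k, W):
    """q, k: (batch, z_dim) embeddings of two augmentations; W: (z_dim, z_dim)."""
    k = k.detach()                          # no gradient through the key encoder
    logits = q @ W @ k.t()                  # bilinear similarity, (batch, batch)
    logits = logits - logits.max(dim=1, keepdim=True).values  # numerical stability
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```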
-
Hey there,
I also use MCTS to predict good actions. However, in my case (a multiplayer card game) it is very expensive to look ahead very far. For this reason, I want to ask whether you know if there is …
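One common way to keep MCTS lookahead affordable is to truncate rollouts at a fixed depth and back off to a heuristic evaluation; a minimal sketch follows, where the node/state interfaces (`children`, `visits`, `step`, `evaluate`, …) are hypothetical stand-ins, not code from the repo under discussion.

```python
# A minimal sketch of UCB1 child selection plus a depth-limited rollout;
# the node/state interfaces used here are hypothetical placeholders.
import math
import random

def ucb1_select(node, c=1.4):
    # Pick the child maximizing the UCB1 score (exploitation + exploration).
    return max(
        node.children,
        key=lambda ch: ch.value / (ch.visits + 1e-9)
        + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)),
    )

def depth_limited_rollout(state, step, evaluate, max_depth=10):
    # Play random moves for at most max_depth steps, then substitute a
    # heuristic evaluation instead of simulating to the end of the game.
    for _ in range(max_depth):
        if state.is_terminal():
            return state.reward()
        state = step(state, random.choice(state.legal_actions()))
    return evaluate(state)
```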
-
Please add an option to disable this.
-
Pose a question about one of the following articles:
“[Human-level control through deep reinforcement learning](https://www.nature.com/articles/nature14236)” 2015. V. Mnih...D. Hassabis. Nature 51…
-
### Is your proposal related to a problem?
We would like to have the ability to ask "follow-up questions" based on the answers to previous questions in the booking request. For example, we could h…
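To make the request concrete, here is one possible shape for such conditional follow-ups as data, sketched in Python; every field name is invented for illustration and is not the project's actual schema.

```python
# A hypothetical schema for follow-up questions; none of these field names
# come from the project, they only illustrate the proposal.
booking_questions = [
    {
        "id": "bringing_equipment",
        "prompt": "Will you bring your own equipment?",
        "type": "yes_no",
        "follow_ups": [
            {
                "if_answer": "yes",  # shown only when the parent answer matches
                "id": "equipment_list",
                "prompt": "Please list the equipment you will bring.",
                "type": "free_text",
            }
        ],
    }
]
```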
-
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
8 pages, ICRA 2018
https://arxiv.org/abs/1709.10089
-
Hi,
Thank you for your dedicated work on PCC-Uspace.
When I followed the instructions in Deep_Learning_Readme.md, I found that the values of both Reward and Ewma Reward were as high as in the snapshot…