-
-
Manually dispatching trains wastes a lot of time, because people have to decide which train runs ahead of which when there are delays. Say there is a turnaround that…
-
-
Hello! A friend and I prototyped a TensorBoard plugin called Agent for visualizing deep reinforcement learning algorithms. Agent is focused on the *time-step level* - enabling you to step frame-by-fra…
-
- [ ] [system-2-research/README.md at main · open-thought/system-2-research](https://github.com/open-thought/system-2-research/blob/main/README.md?plain=1)
# OpenThought - System 2 Research Links
He…
-
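## In a nutshell
A study on reinforcement learning with data augmentation that first performs representation learning and then runs reinforcement learning as usual. The representation learning step uses contrastive learning so that states close together in time are recognized as close, even after augmentation is applied. Reinforcement learning follows. Higher performance than end-to-end training was observed from the very beginning.
### Paper link
https://arxiv.org/abs/2009.08319…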
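
As a rough illustration of the contrastive step described above, here is a minimal sketch of an InfoNCE-style loss over pairs of temporally adjacent states. The choice of InfoNCE, the cosine similarity, the `temperature` value, and the toy embeddings are all assumptions made for illustration; the summary does not state which loss the paper actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss (assumed loss, not confirmed by the source).

    positives[i] is the embedding of the (augmented) temporal neighbour
    of anchors[i]; every other row of `positives` acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


# Hypothetical toy embeddings: two states and their temporal neighbours.
states = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]    # neighbour i matches state i
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # neighbours swapped

loss_aligned = info_nce(states, aligned)
loss_shuffled = info_nce(states, shuffled)
```

The loss is small when each state's embedding is closest to its own temporal neighbour and large when the pairing is wrong, which is the property the pretraining stage relies on before the regular reinforcement learning phase starts.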
-
I'm planning to add a smart crawler that takes a set of user-defined objectives and continues crawling to satisfy them. Objectives can be a query requiring a sufficient amount of information to answer…
-
DDPG, A2C, and other deep reinforcement learning models (value-based vs. policy-based; actor-critic, critic-only, actor-only)
A research paper will be attached below for reference; 1-2 more will be a great place …
-
# OpenAI Gym vs TensorFlow for Deep RL
Let me know if there are any APIs that support both research and production.
-
- [ ] Introduction RL, figure 7 in the paper
- [ ] SARS Element
- [ ] Environment
- [ ] Agent
- [ ] Training Tuning
- [ ] Live Demo, Howto RL 7
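
The SARS element, environment, and agent items above could be sketched roughly as follows. This assumes "SARS" means a (state, action, reward, next state) transition; the `Corridor` environment, the tabular `QAgent`, and all hyperparameters are hypothetical stand-ins, not taken from the paper the checklist refers to.

```python
import random
from typing import NamedTuple


class Transition(NamedTuple):
    """One SARS element: state, action, reward, next state (assumed meaning)."""
    state: int
    action: int
    reward: float
    next_state: int


class Corridor:
    """Hypothetical toy environment: cells 0..n-1, reward 1.0 at the last cell."""

    def __init__(self, n=5):
        self.n = n
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Action 1 moves right, action 0 moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n - 1)
        done = self.state == self.n - 1
        return self.state, (1.0 if done else 0.0), done


class QAgent:
    """Tabular Q-learning agent with epsilon-greedy exploration."""

    def __init__(self, n_states, n_actions=2, alpha=0.5, gamma=0.9,
                 epsilon=0.2, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        row = self.q[state]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))
        best = max(row)
        # Break ties randomly so the untrained agent still explores.
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def learn(self, t, done):
        future = 0.0 if done else self.gamma * max(self.q[t.next_state])
        target = t.reward + future
        self.q[t.state][t.action] += self.alpha * (target - self.q[t.state][t.action])


env = Corridor()
agent = QAgent(env.n)
for _ in range(200):  # training loop
    s, done = env.reset(), False
    while not done:
        a = agent.act(s)
        s2, r, done = env.step(a)
        agent.learn(Transition(s, a, r, s2), done)
        s = s2

# Greedy action in every non-terminal cell after training.
greedy_policy = [max(range(2), key=lambda a: agent.q[s][a]) for s in range(env.n - 1)]
```

The training-tuning item would then amount to varying `alpha`, `gamma`, and `epsilon` and checking how quickly `greedy_policy` settles on moving right in every cell.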