-
### Environment
MekHQ 0.47.16
Java version 15.0.1
Platform Mac OS X 10.15.7 (x86_64)
### Description
When you start getting into dozens of units and a handful of forces, it can be desirable…
-
This thread is for sharing experiment results. I'd appreciate it if you could post your experiment results in this thread when you try my code. The following messages are sample reports.
-
@Kismuz,
I believe I have encountered a framework (A3C) limitation.
While training a few of my recent models I noticed some strange behavior. For the first part of training everything seems to work fi…
-
I notice that images are available through shared memory with C/C++.
But for reinforcement learning, I also need to send control instructions to it. What's more, it would be better to get images via Python, sin…
-
- https://arxiv.org/abs/1801.10467
- 2018
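Novice programmers often struggle with the formal syntax of programming languages.
We therefore designed a new programming-language correction framework that is amenable to reinforcement learning.
In this framework, an agent can mimic human actions for navigating and editing text.
In this study, programmi…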
-
## Description
- An MXNet nightly benchmark, which runs CV and NLP models on the MXNet nightly pip wheel and reports the metrics, showed a regression in GPU memory usage.
- After bise…
-
Hi,
The project is missing the "parameters" folder. Could you add it, please? :)
-
todo: add KL penalty between current and marginal policy as an intrinsic reward/penalty
log π(a|s)/p(a)
the question is if this will induce perseveration
the only thing to figure out is how to …
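
A minimal sketch of the penalty term above, assuming discrete actions and that both π(·|s) and the marginal p(·) are available as probability lists. The function names and the scale `beta` are hypothetical and only for illustration; how p(a) is estimated is left open here.

```python
import math


def intrinsic_penalty(pi_s, p_marginal, action, beta=0.1):
    """Per-step intrinsic term: -beta * log(pi(a|s) / p(a)).

    pi_s       : action probabilities of the current policy in state s
    p_marginal : marginal action probabilities p(a) (assumed to be given)
    beta       : hypothetical scale for the penalty
    """
    log_ratio = math.log(pi_s[action]) - math.log(p_marginal[action])
    return -beta * log_ratio


def kl_to_marginal(pi_s, p_marginal):
    """KL(pi(.|s) || p) = sum_a pi(a|s) * log(pi(a|s) / p(a))."""
    return sum(
        q * (math.log(q) - math.log(p))
        for q, p in zip(pi_s, p_marginal)
        if q > 0.0  # terms with pi(a|s) = 0 contribute nothing
    )
```

Averaged over actions drawn from π(·|s), the per-step term equals `-beta` times the KL divergence, so adding it to the reward penalizes actions that the current policy favors more than the marginal does.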
-
https://datawhalechina.github.io/easy-rl/#/chapter7/chapter7
Description