-
Policy Search
- [ ] [PI2](http://proceedings.mlr.press/v9/theodorou10a/theodorou10a.pdf) is already implemented #28
- [ ] [PoWER](http://www.ias.informatik.tu-darmstadt.de/publications/peters_ADPR…
-
I read your code and implemented a version with experience replay.
However, I find that the loss explodes after a few frames (around 1000). The value loss becomes very large and the action loss becomes very ne…
-
I ran Cart_Pole.py with A3C & A2C on Linux and got the following error.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.r…
-
Hello dear Mr. Yuhang Song,
In the paper, it is mentioned that the rewards for action **v** are given by
![rewv](https://user-images.githubusercontent.com/28454109/75548914-044bd080-5a37-11ea-888…
-
In discrete_A3C.py, the res_queue.get() call in the main function hangs for a very long time (possibly forever) on Linux, but the entire code works perfectly fine on Windows.
```
workers = [Worker(gnet…
-
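## In a nutshell
In reinforcement learning, opportunities to receive a reward become very scarce, especially in high-dimensional settings. This work tries to speed up learning by rewarding "curiosity", that is, reaching novel parts of the environment. With this, it achieved higher learning performance than the baseline (A3C). Demos on Doom and Mario Bros. are included.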
![image](https://cloud.githubusercontent.co…
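The idea of rewarding novelty can be illustrated with a much simpler stand-in than the approach summarized above. The sketch below uses a count-based bonus (the bonus shrinks as a state is visited more often); it is not the method from the summarized work, and the names `NoveltyBonus`, `shaped_reward`, and the `scale` parameter are hypothetical, chosen only for illustration. It assumes states are hashable.

```python
import math
from collections import Counter


class NoveltyBonus:
    """Count-based novelty bonus: scale / sqrt(visit count of the state).

    This is a simple stand-in for a curiosity signal, not the method of
    the summarized work. States must be hashable (assumption).
    """

    def __init__(self, scale=1.0):
        self.scale = scale
        self.visits = Counter()

    def __call__(self, state):
        # Record the visit, then return a bonus that decays with familiarity.
        self.visits[state] += 1
        return self.scale / math.sqrt(self.visits[state])


def shaped_reward(extrinsic, state, bonus):
    """Add the intrinsic novelty bonus to the (possibly sparse) task reward."""
    return extrinsic + bonus(state)
```

With a sparse task reward, the agent still receives a learning signal whenever it reaches a state it has rarely seen, and that signal fades as the state becomes familiar.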
-
When I try to run the saved model as follows:
``` bash
python demo_a3c_ale.py ../roms/breakout.bin trained_model/breakout_ff/80000000_finish.h5
```
I get an error:
``` bash
ImportError: No module named '…
-
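The current implementation uses Python threading. I wonder whether it could be switched to multiprocessing?
For CPU-bound work, multiprocessing usually performs much better on multi-core CPUs than threading in Python, because the global interpreter lock in standard CPython lets only one thread execute Python bytecode at a time.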
Is…
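The threading versus multiprocessing difference can be checked with a small timing sketch. This is a minimal illustration on a toy CPU-bound task, not code from the repository in question; `count_primes`, `run_with_threads`, and `run_with_processes` are hypothetical names, and the timings depend on the machine and on the Python build (free-threaded builds behave differently).

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Toy CPU-bound task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with_threads(limits, workers=4):
    # Threads share one interpreter, so pure-Python work is serialized by the GIL.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


def run_with_processes(limits, workers=4):
    # Each process has its own interpreter and can run on a separate core.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [50_000] * 4
    for name, runner in [("threads", run_with_threads),
                         ("processes", run_with_processes)]:
        start = time.perf_counter()
        runner(limits)
        print(f"{name}: {time.perf_counter() - start:.2f}s")
```

The `if __name__ == "__main__":` guard matters for the process-based variant, since some start methods re-import the main module in each worker. Processes also pay for pickling arguments and results, so the benefit shows up mainly when each task does substantial computation.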
-
Do you have plans to implement the Value Iteration Networks paper (NIPS 2016 best paper) in TensorFlow? It would be great.