-
```
Traceback (most recent call last):
  File "TRPO.py", line 169, in
    action = agent.act_and_train(obs, reward)
  File "C:\Anaconda3\envs\osim-rl2\lib\site-packages\chainerrl\agents\trpo.py", line 680, in …
```
-
The TRPO and PPO implementations are general enough to be in their own solver package in the POMDPs.jl ecosystem. I've already encapsulated these solvers into the DeepRL module.
Some TODOs:
- [ ] …
-
rllab works fine on the CentOS server, but it does not work on my Mac. When I ran trpo_cartpole.py in examples, it got stuck here and made no progress:
python trpo_cartpole.py
/Users/lchenat/anaconda2/li…
-
`python example/trpo_swimmer.py` works well. With the default settings it produces an average reward of 55.72 after 40 iterations.
When I try to run trpo_swimmer.py in "stub" mode (I simply add "…
-
Hello everybody!
I'm trying to train a 6-DOF robotic arm (UR5) to reach a 3D goal in its reachable space with DDPG, TRPO, etc.
I've created my own MuJoCo asset and Gym environment to be launched in …
-
Hi,
I am getting an error when running the examples:
```
Traceback (most recent call last):
  File "rllab/examples/trpo_cartpole.py", line 1, in
    from rllab.algos.trpo import TRPO
  File "/…
```
-
Why is the update subtracted in the TRPO tf1 implementation? It seems to be the opposite of what is said in the paper and the SpinningUp docs.
```
def set_and_eval(step):
    sess.run(set_pi_params, feed_dict={v_ph: old_params - alpha …
```
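For what it's worth, the sign is most likely just a convention: if `pi_loss` is defined as the negative of the surrogate objective (something to be minimized) and `g`/`x` are computed from that loss, then subtracting `alpha * x` descends the loss, which is exactly ascending the surrogate objective from the paper. Below is a minimal self-contained sketch of that convention; the quadratic stand-in for the surrogate, `A`, `b`, and `alpha` are made up for illustration and are not from the SpinningUp code.

```python
import numpy as np

# Hypothetical quadratic "surrogate objective" J(theta) = b.theta - 0.5 theta'A theta,
# which we want to MAXIMIZE (A plays the role of the curvature / Fisher matrix H).
A = np.array([[2.0, 0.3],
              [0.3, 1.0]])
b = np.array([1.0, -0.5])

def surrogate(theta):
    return b @ theta - 0.5 * theta @ A @ theta

theta_old = np.zeros(2)

# "pi_loss" convention: loss = -J(theta), to be MINIMIZED.
grad_loss = -(b - A @ theta_old)     # gradient of the loss at theta_old
x = np.linalg.solve(A, grad_loss)    # search direction solving H x = g
alpha = 0.5

# The step is SUBTRACTED because we are descending the (negated) loss ...
theta_new = theta_old - alpha * x

# ... yet the surrogate objective from the paper still goes up.
assert surrogate(theta_new) > surrogate(theta_old)
print(surrogate(theta_old), surrogate(theta_new))
```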
-
Implement the main agent with Trust Region Policy Optimization (TRPO, see [Link](https://arxiv.org/abs/1502.05477))
- [x] Set up InvertedPendulum environment in OpenAI Gym
- [x] Set up neural net an…
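For reference, a minimal sketch of the first two checklist items (environment setup plus a small Gaussian policy network), assuming classic gym with mujoco-py; the env id, hidden sizes, and the `GaussianPolicy` class here are illustrative choices, not the project's actual code:

```python
import gym
import torch
import torch.nn as nn

# Env id may be "InvertedPendulum-v4" on newer gym/gymnasium builds.
env = gym.make("InvertedPendulum-v2")
obs_dim = env.observation_space.shape[0]
act_dim = env.action_space.shape[0]

class GaussianPolicy(nn.Module):
    """Diagonal-Gaussian policy: an MLP outputs the mean, log-std is a free parameter."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.mu_net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def distribution(self, obs):
        return torch.distributions.Normal(self.mu_net(obs), self.log_std.exp())

policy = GaussianPolicy(obs_dim, act_dim)

# Single interaction step (classic gym API: reset() -> obs, step() -> 4-tuple).
obs = env.reset()
dist = policy.distribution(torch.as_tensor(obs, dtype=torch.float32))
action = dist.sample()
obs, reward, done, info = env.step(action.numpy())
```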
-
Using TensorFlow TRPO for the OpenAI Gym MountainCar-v0 environment doesn't converge every run. Some runs converge to a good policy; others stay at -200 reward forever.
Gist of code atte…
-
I am running the trpo_mpi code on the version1 branch. When I run the experiment, it waits on a random port: "Waiting for server on 33791..."
It shows a different port for the multi-threaded run. I tri…