-
On p. 175, PPO is presented as off-policy and its algorithm is derived via the importance-sampling technique used in off-policy learning, but OpenAI's documentation describes PPO as on-policy and derives it as a first-order approximation to TRPO. (For details see: https://spinningup.openai.com/en/latest/algorithms/ppo.html)
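A short note that may reconcile the two views (my own sketch, not from the book): PPO's clipped surrogate objective does contain an importance-sampling ratio, but only between the current policy \(\pi_\theta\) and the recent policy \(\pi_{\theta_{\text{old}}}\) that collected the data, which is why Spinning Up still classifies it as on-policy:

```latex
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}, \qquad
L^{\text{CLIP}}(\theta) =
\hat{\mathbb{E}}_t\!\left[\min\!\bigl(r_t(\theta)\,\hat{A}_t,\;
\operatorname{clip}\!\bigl(r_t(\theta),\,1-\epsilon,\,1+\epsilon\bigr)\,\hat{A}_t\bigr)\right]
```

Here \(\hat{A}_t\) is the advantage estimate and \(\epsilon\) the clip range; the ratio comes from the importance-sampling derivation, while the clipping replaces TRPO's explicit trust-region constraint.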
-
Hi Vince, many thanks for your fantastic work! I would like to know if there is a plan to support the TRPO algorithm. Thanks a lot!
-
Hi all,
I want to apply multi-agent reinforcement learning, specifically the PPO, TRPO, DDPG, and A2C algorithms. I don't understand how to write a Carla environment for these algorithms. Is any …
-
Hi,
I am wondering whether chainerrl supports running TRPO on Atari. I tried to do so by following the code for training PPO on Atari, but I am faced with the following error:
> Traceback (most rec…
-
Training is very slow; we need to check the hyperparameters against the original paper.
-
When I run wgail.py, it seems the trpo module is missing. Can you provide some details about running this file? Thanks very much!
-
I want to replace TRPO with DDPG + HER and am having difficulties. The combination only works with a task that is registered with Gym. How did TRPO avoid that requirement?
-
Hello, thank you for sharing your code.
May I ask a question about the paper? Since PPO is an upgrade of TRPO, have you considered using PPO instead of TRPO? I am facing this question in my thesis. I wonder…
-
Damn it, create a new project in Visual Studio in the same solution. Name it TRPO.Database. Move the Database class from the TRPO.Services project into it.
-
On executing trpo_continous.py, I get the following error:
> [2017-07-01 23:52:58,375] Making new env: CartPole-v0
> [TL] InputLayer continous_shared/continous_input_layer: (?, 3)
> [TL…