-
I trained your PPO implementation first.
` python examples/ppo_gym.py --env-name Ant-v2 --save-model-interval 100 `
After 500 episodes, I generated expert trajectories.
` python gail/save_expert_traj.py --model-path as…
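For context, an expert-trajectory script like the one above typically rolls out the trained policy and records state–action pairs. The sketch below is a hypothetical minimal version, not the repository's actual `save_expert_traj.py`; the `env` and `policy` interfaces are assumptions (classic gym-style `reset`/`step`, and a callable mapping state to action).

```python
import numpy as np

def collect_expert_traj(env, policy, num_steps=1000):
    """Roll out a trained policy and record (state, action) pairs.

    `env` is assumed to follow the classic gym API (reset/step returning
    obs, reward, done, info); `policy` maps a state to an action.
    Both names are hypothetical placeholders.
    """
    expert_traj = []
    state = env.reset()
    for _ in range(num_steps):
        action = policy(state)
        next_state, reward, done, _ = env.step(action)
        # store each transition as a flat [state, action] row
        expert_traj.append(np.hstack([state, action]))
        state = env.reset() if done else next_state
    # shape: (num_steps, state_dim + action_dim)
    return np.stack(expert_traj)
```

The resulting array can then be saved (e.g. with `np.save` or `pickle`) for the GAIL discriminator to consume.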
-
env: Reacher
algo: PPO
Error:
Traceback (most recent call last):
File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/main.py", line 196, in <module>
main()
File "/home/al/Desktop/pytorch-…
-
Hello,
I followed the steps mentioned [here](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail#requirements) to install the requirements for this repository. There is one minor change: I am using vi…
-
Is there a reason you calculate the reward the way you do on line 69?
https://github.com/toshikwa/gail-airl-ppo.pytorch/blob/4e13a23454600a16d5aeeeb4c09338308115455e/gail_airl_ppo/algo/airl.py#L69
…
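One detail worth noting about discriminator-based rewards: when the discriminator outputs a raw logit and D = sigmoid(logit), the quantity log D − log(1 − D) collapses algebraically to the logit itself. The snippet below is just a numeric illustration of that identity, not the linked repository's code.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def airl_style_reward(logit):
    # log D - log(1 - D) with D = sigmoid(logit)
    # simplifies exactly to the raw logit
    d = sigmoid(logit)
    return math.log(d) - math.log(1.0 - d)
```

This is why some implementations compute the reward directly from the discriminator's logits rather than from the sigmoid output, which is also more numerically stable.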
-
## Problem
Hi, `imitation` is a great project!
Currently, I am training the GAIL algorithm, and the learner network is PPO from SB3. I have questions about the training process for `imitation\GAIL\t…
-
I noticed that the predict-reward function uses log(D(.)) - log(1-D(.)) as the reward to update the generator. However, this is the reward function proposed in the AIRL paper, which minimizes the rever…
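To make the distinction concrete, here is a small sketch of the two reward forms being contrasted, with D taken as the discriminator's probability that a sample is expert data. This is an illustrative comparison under that convention, not code from the repository in question.

```python
import math

def gail_reward(d):
    # generator reward from the original GAIL paper: -log(1 - D)
    return -math.log(1.0 - d)

def airl_reward(d):
    # the form questioned above: log D - log(1 - D)
    return math.log(d) - math.log(1.0 - d)
```

At D = 0.5 (a maximally uncertain discriminator) the AIRL-style reward is zero and can go negative, while the GAIL-style reward is always positive; this sign difference is one practical reason the choice matters for exploration.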
-
To improve the stability and robustness of the policy, implement proximal policy optimization (PPO):
- https://arxiv.org/abs/1707.06347
- code: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail
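The core of PPO is the clipped surrogate objective from the paper linked above. A minimal NumPy sketch of that loss (not the linked repository's implementation) might look like:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate loss from the PPO paper (arXiv:1707.06347).

    ratio = pi_new(a|s) / pi_old(a|s) per sample; minimizing the
    returned loss maximizes the clipped objective.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # take the pessimistic (smaller) objective, then negate for a loss
    return -np.minimum(unclipped, clipped).mean()
```

The clipping removes the incentive to move the policy ratio outside [1 − ε, 1 + ε], which is what gives PPO its stability relative to a plain policy-gradient update.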
-
Hello, thank you for sharing your code.
May I ask a question about the paper? Since PPO is an upgrade of TRPO, have you considered using PPO instead of TRPO? I am facing this question in my thesis. I wonder…
-
Hi JongseongChae, I am wondering how to run only the GAIL algorithm using your code. Could you give me a run example for GAIL?
-
I have read the sample project code; the RL sample was in Python,
but I need one in C#. Can we expect other RL algorithms like A3C, GAIL, Rainbow, and PPO in the future?