ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License
3.53k stars, 832 forks

Operations that have no effect #281

Open ArashVahabpour opened 3 years ago

ArashVahabpour commented 3 years ago

Hi, the two lines referenced below seem to cancel each other out: the second quoted line is the inverse of the sigmoid (the logit) applied in the first. What is the purpose of including both? https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/1120cdfe94e79294a52486590d9c2bcc5c01730d/a2c_ppo_acktr/algo/gail.py#L101-L102
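To make the cancellation concrete, here is a minimal sketch in plain Python. It assumes the referenced lines compute `s = sigmoid(d)` and then `log(s) - log(1 - s)`. That is my reading of the link, not a copy of the repository code, and the helper names below are hypothetical. Algebraically, log(s) - log(1 - s) = log(s / (1 - s)) = d, so the pair should return the discriminator logit unchanged.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    """Inverse of the sigmoid, written as log(p) - log(1 - p)."""
    return math.log(p) - math.log(1.0 - p)


def roundtrip_error(d: float) -> float:
    """Absolute difference between logit(sigmoid(d)) and d."""
    return abs(logit(sigmoid(d)) - d)


# Hypothetical discriminator outputs (logits), chosen for illustration only.
for d in (-3.0, -0.5, 0.0, 2.0):
    print(f"d={d:+.1f}  logit(sigmoid(d))={logit(sigmoid(d)):+.6f}")
```

For moderate values the round trip matches `d` up to floating-point error. For logits of large magnitude the sigmoid saturates in floating point, so the composition is an exact no-op only algebraically, not numerically.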

If the purpose is to make this a Wasserstein GAIL, it would be nice to make that explicit with something like if args.wasserstein ... else ...