PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
Hi, the two lines referenced below seem to cancel each other out (the second quoted line is the inverse of the sigmoid, i.e. the logit). What is the purpose of having both? https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/1120cdfe94e79294a52486590d9c2bcc5c01730d/a2c_ppo_acktr/algo/gail.py#L101-L102
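To make the cancellation concrete, here is a minimal standalone check. It assumes the two lines have the form `s = sigmoid(d)` followed by `reward = log(s) - log(1 - s)`; the helper names `sigmoid` and `logit_of_sigmoid` are mine, not from the repo, and I use plain `math` rather than torch so it runs anywhere.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def logit_of_sigmoid(d: float) -> float:
    """Apply the sigmoid, then log(s) - log(1 - s), i.e. the logit.

    Assumed to mirror the two referenced lines; hypothetical helper.
    """
    s = sigmoid(d)
    return math.log(s) - math.log(1.0 - s)


# For moderate inputs the round trip returns the raw discriminator output d.
for d in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert math.isclose(logit_of_sigmoid(d), d, abs_tol=1e-9)
```

So, up to floating-point error, the reward would just be the raw output `d`. One difference is that the round trip is less stable numerically: for large positive `d` the sigmoid saturates to exactly 1.0 and `log(1 - s)` is no longer defined.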
I think if the purpose was to make this a Wasserstein GAIL, it would be nice to do something like
if args.wasserstein... else...
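A rough sketch of what I mean, just to illustrate the branching. Everything here is hypothetical: the `wasserstein` flag, the function name `predict_reward_sketch`, and the choice of `-log(1 - sigmoid(d))` for the non-Wasserstein branch (one common GAIL reward form, not necessarily what this repo intends).

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def predict_reward_sketch(d: float, wasserstein: bool) -> float:
    """Hypothetical reward from a raw discriminator output d.

    wasserstein=True:  use the raw critic score directly.
    wasserstein=False: use -log(1 - sigmoid(d)), which equals softplus(d).
    """
    if wasserstein:
        return d
    return softplus(d)
```

With a flag like this, the intent of each reward form would be explicit in the code instead of hidden in a sigmoid followed by its inverse.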