nicholas0717 closed this issue 1 year ago
Hi @nicholas0717
The output is in (-inf, inf) if you don't specify `output_activation`.
https://github.com/toshikwa/gail-airl-ppo.pytorch/blob/master/gail_airl_ppo/network/utils.py#L15-L16
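To illustrate the point, here is a minimal, dependency-free sketch. It is not the code from `utils.py`: the function `tiny_mlp` and all of its weights are hypothetical, chosen only to mirror the structure of one tanh hidden layer followed by a linear output layer. The hidden activations stay inside (-1, 1), but the final linear layer rescales them, so the output is unbounded unless an output activation is applied.

```python
import math


def tiny_mlp(x, w_hidden, b_hidden, w_out, b_out, output_activation=None):
    """Hypothetical MLP on a scalar input: tanh hidden layer, linear output layer."""
    # Hidden activations are squashed into (-1, 1) by tanh.
    hidden = [math.tanh(w * x + b) for w, b in zip(w_hidden, b_hidden)]
    # The output layer is a plain weighted sum, so nothing squashes it.
    out = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    # Only an explicit output activation would bound the result.
    if output_activation is not None:
        out = output_activation(out)
    return hidden, out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative weights (assumptions, not values from the repository).
params = dict(w_hidden=[1.0, 1.0], b_hidden=[0.0, 0.0], w_out=[5.0, 5.0], b_out=0.0)

hidden, logit = tiny_mlp(2.0, **params)
_, prob = tiny_mlp(2.0, **params, output_activation=sigmoid)

print(hidden)  # every entry lies in (-1, 1)
print(logit)   # lies outside (-1, 1) because the output weights are 5.0
print(prob)    # lies in (0, 1) once a sigmoid is applied
```

With larger output weights the raw output grows without limit, which is why it is treated as a logit in (-inf, inf) rather than a probability.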
Does it make sense?
Thank you! I get it now. I didn't read utils.py carefully.
Appreciate your reply!
Hi @toshikwa
I'm puzzled by your comment in `update_disc()`. You said the output of the discriminator is in (-inf, inf), not [0, 1]. Shouldn't the output of the disc be in (-1, 1) when the `hidden_activation` of the disc is `nn.Tanh()`? I'm not sure whether I've misunderstood something. Looking forward to your reply. Thanks!