CherryPieSexy / imitation_learning

PyTorch implementation of some reinforcement learning algorithms: A2C, PPO, Behavioral Cloning from Observation (BCO), GAIL.
139 stars 15 forks

[Question] Why should I set action*2 if I want to get the correct output action in GAIL? #1

Closed LongchaoDa closed 2 years ago

LongchaoDa commented 2 years ago

Hi, thank you for the implementation of the RL and IL algorithms; it is especially helpful! While implementing some of the models, I got confused about the output action size setting. For example, if I'm using GAIL and want the generator to output 2 values (x, y), why should I set the actual output size to [ "out_put" * 2 ]?

Looking forward to your reply! Thank you!

CherryPieSexy commented 2 years ago

Hello! When defining an actor (which is a neural network model), you should specify its output size. To predict actions, you have to initialize a probability distribution by specifying its parameters; for example, for a Normal distribution these are mu and sigma for each action coordinate. So the network should produce twice as many parameters as the action dim.

This is true for every continuous distribution I've implemented. For discrete distributions you should not double the action dim.
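To make the sizing rule above concrete, here is a minimal sketch in plain Python (no PyTorch, to keep it dependency-free). The helper names `actor_output_size`, `split_normal_params`, and `sample_action` are hypothetical and are not taken from the repository, and treating the second half of the output as log(sigma) is an assumption made only for illustration.

```python
import math
import random


def actor_output_size(action_dim, continuous):
    """Number of raw values the actor must emit (hypothetical helper)."""
    # Continuous: two parameters (e.g. mu and sigma) per action coordinate.
    # Discrete: one value per discrete choice, so no doubling.
    return 2 * action_dim if continuous else action_dim


def split_normal_params(raw):
    """Split a flat output of size 2 * action_dim into (mu, sigma)."""
    action_dim = len(raw) // 2
    mu = raw[:action_dim]
    # Assumption: the second half holds log(sigma), so sigma stays positive.
    sigma = [math.exp(v) for v in raw[action_dim:]]
    return mu, sigma


def sample_action(raw, rng=random):
    """Draw one action coordinate per (mu, sigma) pair."""
    mu, sigma = split_normal_params(raw)
    return [rng.gauss(m, s) for m, s in zip(mu, sigma)]


# A 2-D (x, y) action needs 4 raw network outputs.
raw_output = [0.5, -1.0, 0.0, 0.0]
mu, sigma = split_normal_params(raw_output)
action = sample_action(raw_output)
```

With a 2-D action, the network emits 4 values, but the sampled `action` still has only 2 coordinates.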

LongchaoDa commented 2 years ago

That's so kind of you! Thanks for the answer. I saw that the dim is turned back into the real output size (for example, in convert_params_beta() you used action_size = params_size//2), and I get it now. Thanks again for the clear and distinguished work!
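The halving step mentioned above (action_size = params_size//2) can be sketched for a Beta distribution as follows. This is not the repository's convert_params_beta(); the function name `split_beta_params` and the "1 + softplus" mapping to positive concentration parameters are assumptions chosen for illustration.

```python
import math
import random


def softplus(x):
    """Smooth positive mapping log(1 + exp(x))."""
    return math.log1p(math.exp(x))


def split_beta_params(raw):
    """Hypothetical helper: turn 2 * action_size raw values into (alpha, beta)."""
    params_size = len(raw)
    action_size = params_size // 2  # recover the real action dimension
    # Assumption: 1 + softplus keeps both concentrations above 1.
    alpha = [1.0 + softplus(v) for v in raw[:action_size]]
    beta = [1.0 + softplus(v) for v in raw[action_size:]]
    return alpha, beta


raw_output = [0.3, -0.7, 1.2, 0.0]  # 4 raw values for a 2-D action
alpha, beta = split_beta_params(raw_output)
# One Beta sample per action coordinate, each in [0, 1].
action = [random.betavariate(a, b) for a, b in zip(alpha, beta)]
```

Beta samples lie in [0, 1], so a real policy would typically rescale them to the environment's action bounds.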

LongchaoDa commented 2 years ago

(By the way, would it be convenient for you to share your email? I'm trying to use GAIL with a recurrent policy and would like to discuss more details. By email we could also share screenshots, which I guess is more effective. I really admire your work, so I would like to learn from you as a friend! If you don't want too many people to bother you, you may delete the comment with your email address after you get my greeting email.)

CherryPieSexy commented 2 years ago

Glad to read that you managed to understand. My email is interga@post-hardcore.ru and my Telegram is CherryPieHSE. Feel free to reach out.

LongchaoDa commented 2 years ago

Thanks! My Telegram is danielsmithda. We can add each other as contacts to start a conversation!