Closed ZhenZhuHuang closed 2 years ago
The reason I compute us = means + noises * log_stds.exp() is the "reparameterization trick". Try calculating the log probability of the Gaussian distribution using noises[i] ~ N(\mu=0, \sigma=I), and you'll see.
I avoided using torch.distributions.Normal().log_prob() in order to reduce computation.
Does it make sense?
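For readers following that suggestion, here is a short derivation sketch. It assumes a diagonal Gaussian in which each dimension i has mean \mu_i and standard deviation \sigma_i = exp(log_stds[i]); the symbols u_i and \varepsilon_i stand for one component of us and noises and are notation introduced here, not taken from the repository.

```latex
\begin{aligned}
u_i &= \mu_i + \sigma_i \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, 1) \\
\log \mathcal{N}(u_i \mid \mu_i, \sigma_i^2)
  &= -\frac{(u_i - \mu_i)^2}{2\sigma_i^2} - \log \sigma_i - \tfrac{1}{2}\log(2\pi) \\
  &= -\tfrac{1}{2}\varepsilon_i^2 - \log \sigma_i - \tfrac{1}{2}\log(2\pi)
\end{aligned}
```

Since u_i - \mu_i = \sigma_i \varepsilon_i, the mean cancels, so under these assumptions the log probability of the sample depends only on the noise and the log standard deviation. Summing over the dimensions gives the log probability of the whole vector.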
What I don't understand is that the formula used by Normal.log_prob seems to differ from the one in your code, which confuses me. Thank you for your answer, Sir.
------------------ Original message ------------------ From: "ku2482/gail-airl-ppo.pytorch"; Sent: Monday, August 15, 2022, 3:23 PM; Subject: Re: [ku2482/gail-airl-ppo.pytorch] about reparametrize (Issue #7)
Normal(means, stds).log_prob(means + noises * stds) equals Normal(0, stds).log_prob(noises * stds), so I can skip the unnecessary calculation.
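The equality can be checked numerically. The sketch below uses only the Python standard library and a single scalar dimension. The helpers normal_log_prob and log_prob_from_noise are hypothetical names written for this illustration (they are not the repository's functions); the first implements the standard closed-form Gaussian log-density, and the second is the shortcut that never touches the mean.

```python
import math
import random


def normal_log_prob(x, mean, std):
    """Closed-form log-density of a univariate Normal(mean, std) at x."""
    return (
        -((x - mean) ** 2) / (2 * std ** 2)
        - math.log(std)
        - 0.5 * math.log(2 * math.pi)
    )


def log_prob_from_noise(noise, log_std):
    """Same quantity computed from the noise alone (hypothetical helper)."""
    return -0.5 * noise ** 2 - log_std - 0.5 * math.log(2 * math.pi)


random.seed(0)
mean, log_std = 1.7, -0.3          # arbitrary illustrative values
std = math.exp(log_std)
noise = random.gauss(0.0, 1.0)     # noise ~ N(0, 1)
u = mean + noise * std             # reparameterized sample

full = normal_log_prob(u, mean, std)              # Normal(mean, std) at u
shifted = normal_log_prob(noise * std, 0.0, std)  # Normal(0, std) at noise * std
shortcut = log_prob_from_noise(noise, log_std)    # no mean, no subtraction

print(full, shifted, shortcut)
```

All three values agree up to floating-point rounding, which is why the shortcut can replace the full log_prob call when the noise is already available.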
Okay, I think I get it. One last question: are there any relevant papers or blog posts on GAIL based on PPO and SAC? Thank you for your help, Sir.
I'm sorry, but I don't remember...
Well, thank you for your patience!
Hello, I would like to ask why the calculate_log_pi function calculates the log probability (log pi) this way. I can't find the algorithm it is based on.