twni2016 / pomdp-baselines

Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022
https://sites.google.com/view/pomdp-baselines
MIT License
307 stars, 42 forks

Potential bug? #11

Closed: hai-h-nguyen closed this issue 2 years ago

hai-h-nguyen commented 2 years ago

For SACD, can you explain why the code does this (https://github.com/twni2016/pomdp-baselines/blob/main/policies/models/policy_rnn.py#L235) instead of this (https://github.com/twni2016/pomdp-baselines/blob/main/policies/models/policy_rnn.py#L230), which I think is the correct version?

twni2016 commented 2 years ago

Because the target value is V(s') = E_{a' \sim \pi(\cdot | s')}[Q(s', a')]. I think there is no bug?
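
To make the expectation in the reply concrete, here is a minimal sketch for a discrete action space, where the expectation becomes a probability-weighted sum over actions. It is not the repository's code: the function name `expected_q`, its arguments, and the numbers are hypothetical. The optional `alpha` term is an assumption added for illustration, since soft actor-critic targets usually include an entropy bonus; with the default `alpha=0.0` the function computes exactly the formula above.

```python
import math


def expected_q(probs, q_values, alpha=0.0):
    """Return sum_a pi(a|s') * (Q(s', a) - alpha * log pi(a|s')).

    probs    -- action probabilities pi(.|s') for one next state (hypothetical input)
    q_values -- Q(s', a) for each action, in the same order
    alpha    -- entropy coefficient; 0.0 gives the plain expectation E[Q(s', a')]
    """
    if len(probs) != len(q_values):
        raise ValueError("probs and q_values must have the same length")
    total = 0.0
    for p, q in zip(probs, q_values):
        if p > 0.0:  # actions with zero probability contribute nothing
            total += p * (q - alpha * math.log(p))
    return total


# Illustrative numbers only.
probs = [0.5, 0.25, 0.25]
q_values = [1.0, 2.0, 4.0]
v_next = expected_q(probs, q_values)
print(v_next)  # 0.5*1.0 + 0.25*2.0 + 0.25*4.0 = 2.0
```

Because the action set is finite, the sum is exact and no sampling of a' is needed. Note that the result (2.0 here) differs from both the greedy value max_a Q(s', a) = 4.0 and the Q-value of any single sampled action.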

hai-h-nguyen commented 2 years ago

Yeah, it's not a bug, so I closed the issue. Thanks for replying.