JuliaPOMDP / BasicPOMCP.jl

The PO-UCT algorithm (aka POMCP) implemented in Julia

Why isn't the state particle stored as described in the original paper? #16

Closed Wu-Chenyang closed 4 years ago

Wu-Chenyang commented 4 years ago

I notice that in this implementation of POMCP, the state particles of a belief node aren't stored as described in [1]. I find this inconvenient when you want to use a rollout policy that takes in the current belief and generates an action.

[1] Silver, D., & Veness, J. (2010). Monte-Carlo Planning in Large POMDPs. In Advances in neural information processing systems (pp. 2164–2172). Retrieved from http://discovery.ucl.ac.uk/1347369/

zsunberg commented 4 years ago

Hi @Chengyang-Wu, thanks for your comment. Actually, storing the state particles would not give you any additional information about the belief for rollouts. The reason is that rollouts are only performed from leaf nodes that have just been created, so every node that is used for a rollout has only a single state particle.
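To make the single-particle point concrete, here is a minimal Python sketch of a POMCP-style `simulate` that does store particles in each node and records how many are present whenever a rollout starts. It is not the BasicPOMCP.jl code: the two-state model in `step`, the `Node` class, and all parameter values are hypothetical and chosen only for illustration. In this sketch the incoming state is stored before the rollout, so the count is exactly 1.

```python
import math
import random

ACTIONS = (0, 1)

class Node:
    """History node that stores state particles (hypothetical layout)."""
    def __init__(self):
        self.particles = []   # states that have passed through this history
        self.children = {}    # (action, observation) -> Node
        self.n = {}           # action -> visit count
        self.q = {}           # action -> running mean return

def step(s, a, rng):
    """Hypothetical two-state generative model returning (s', o, r)."""
    sp = s if rng.random() < 0.8 else 1 - s      # state mostly persists
    o = sp if rng.random() < 0.7 else 1 - sp     # noisy observation of s'
    r = 1.0 if a == s else 0.0                   # reward for matching the state
    return sp, o, r

def rollout(s, depth, rng, gamma):
    """Random rollout from the single state handed to the new node."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        s, _, r = step(s, rng.choice(ACTIONS), rng)
        total += discount * r
        discount *= gamma
    return total

def simulate(s, node, depth, rng, counts, gamma=0.95, c=1.0):
    if depth == 0:
        return 0.0
    node.particles.append(s)
    if not node.n:
        # First visit: the node was just created, so a rollout starts here.
        for a in ACTIONS:
            node.n[a] = 0
            node.q[a] = 0.0
        counts.append(len(node.particles))  # particles available to the rollout
        return rollout(s, depth, rng, gamma)
    total = sum(node.n.values())

    def ucb(a):
        if node.n[a] == 0:
            return math.inf
        return node.q[a] + c * math.sqrt(math.log(total) / node.n[a])

    a = max(ACTIONS, key=ucb)
    sp, o, r = step(s, a, rng)
    child = node.children.setdefault((a, o), Node())
    ret = r + gamma * simulate(sp, child, depth - 1, rng, counts, gamma, c)
    node.n[a] += 1
    node.q[a] += (ret - node.q[a]) / node.n[a]
    return ret

rng = random.Random(0)
root, counts = Node(), []
for _ in range(200):
    # Sample the start state from a uniform root belief (assumption).
    simulate(rng.choice((0, 1)), root, 10, rng, counts)
```

Every entry in `counts` is 1, while `root.particles` grows with each simulation. Stored particles accumulate only at nodes that are revisited, and those nodes never start a rollout again.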

The original paper actually suggests using history-based rollouts which are quite confusing to program, so we added the option to use the single state particle for a fully-observable rollout policy (even though this may over-estimate the value).
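As a rough illustration of why a fully-observable rollout policy can over-estimate the value, consider a single decision in a Tiger-style problem. The rewards and the uniform belief below are illustrative assumptions, not values taken from this package or from the paper.

```python
# Hypothetical Tiger-style setup: the tiger is behind the left or right door.
R_SAFE, R_TIGER = 10.0, -100.0            # illustrative rewards (assumption)
belief = {"left": 0.5, "right": 0.5}      # belief over the tiger's location

def reward(tiger, opened):
    """Reward for opening a door given where the tiger actually is."""
    return R_TIGER if opened == tiger else R_SAFE

def other(door):
    return "right" if door == "left" else "left"

# State-informed rollout policy: it sees the sampled state, so it always
# opens the door without the tiger.
v_state = sum(p * reward(s, other(s)) for s, p in belief.items())

# Belief-based policy: it must commit to one door for every state in the belief.
v_belief = max(
    sum(p * reward(s, door) for s, p in belief.items()) for door in belief
)
```

Here `v_state` is 10.0 and `v_belief` is -45.0. Averaging over particles a policy that is allowed to see each particle's true state gives an optimistic estimate, because no policy that acts on the belief alone can achieve it.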

Does that make sense?

Wu-Chenyang commented 4 years ago

Thank you very much for your reply. I think your explanation is perfectly correct and it cleared up my confusion.