JuliaPOMDP / POMDPs.jl

MDPs and POMDPs in Julia - An interface for defining, solving, and simulating fully and partially observable Markov decision processes on discrete and continuous spaces.
http://juliapomdp.github.io/POMDPs.jl/latest/
Other
664 stars 100 forks source link

GenerativeBeliefMDP isterminal is only true when the support is entirely terminal #544

Closed FlyingWorkshop closed 2 months ago

FlyingWorkshop commented 6 months ago

This isn't an error per say, but isterminal in GenerativeBeliefMDP is only true when the entire support is terminal, but when @gen is called, we sample single state from the belief, meaning isterminal(bmdp, b = false), but @gen(b, a) may fail because a terminal state is sampled. Is there a good reason to define "terminal belief" this way? I agree that checking if any terminal state is in the support might be overly reactive. This becomes a problem, for instance, when we want to generate a bunch of samples and early exit on a "terminal belief."

zsunberg commented 6 months ago

@FlyingWorkshop did you close this because it was resolved? If it can throw an error when it samples a terminal state, we should probably fix that. I think @gen should only sample non-terminal states.

FlyingWorkshop commented 6 months ago

@zsunberg, I closed it b/c I took a look at the source code again and realized that while isterminal(::GenerativeBeliefMDP, b) is true when all support states are terminal, the @gen block actually checks is the sampled state is terminal:isterminal(bmdp.pomdp, rand(rng, b)). There is an issue w/ sampling terminal states, but I think this is expected behavior b/c there's a overwritable handler for this.

I was testing this on TMaze and there would be an error when we fell back on the default handler b/c reward(::TMaze, ::TerminalState, ::Int) isn't defined. I ended up just implemented the reward function, so don't think this is still an issue.

dylan-asmar commented 2 months ago

@zsunberg was this resolved with https://github.com/JuliaPOMDP/POMDPs.jl/pull/559?

zsunberg commented 2 months ago

Well, "resolved" is an optimistic word, but yes, I think the behavior was changed in a positive way, so I am closing this :)