Closed FlyingWorkshop closed 2 months ago
@FlyingWorkshop did you close this because it was resolved? If it can throw an error when it samples a terminal state, we should probably fix that. I think `@gen` should only sample non-terminal states.
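As a rough illustration of what "only sample non-terminal states" could look like, here is a minimal sketch using rejection sampling. Everything in it (`ToyBelief`, `toy_isterminal`, `sample_state`, `sample_nonterminal`) is a hypothetical stand-in written for this example and is not part of POMDPs.jl:

```julia
using Random

# Hypothetical toy belief: a finite support with matching probabilities.
struct ToyBelief
    states::Vector{Symbol}
    probs::Vector{Float64}
end

# Hypothetical terminal test; stands in for isterminal(pomdp, s).
toy_isterminal(s::Symbol) = s === :done

# Draw one state from the belief by walking the cumulative distribution.
function sample_state(rng::AbstractRNG, b::ToyBelief)
    u = rand(rng) * sum(b.probs)
    acc = 0.0
    for (s, p) in zip(b.states, b.probs)
        acc += p
        u < acc && return s
    end
    return last(b.states)  # guard against floating-point round-off
end

# Rejection-sample until a non-terminal state is drawn. Returns `nothing`
# if the support has no non-terminal state or max_tries is exhausted.
function sample_nonterminal(rng::AbstractRNG, b::ToyBelief; max_tries::Int = 1_000)
    any(!toy_isterminal, b.states) || return nothing
    for _ in 1:max_tries
        s = sample_state(rng, b)
        toy_isterminal(s) || return s
    end
    return nothing
end
```

Rejection sampling is only one option; reweighting the belief over its non-terminal support would avoid the retry loop, at the cost of touching the belief representation.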
@zsunberg, I closed it b/c I took a look at the source code again and realized that while `isterminal(::GenerativeBeliefMDP, b)` is true when all support states are terminal, the `@gen` block actually checks if the sampled state is terminal: `isterminal(bmdp.pomdp, rand(rng, b))`. There is an issue w/ sampling terminal states, but I think this is expected behavior b/c there's an overwritable handler for this.
I was testing this on `TMaze` and there would be an error when we fell back on the default handler b/c `reward(::TMaze, ::TerminalState, ::Int)` isn't defined. I ended up just implementing the reward function, so I don't think this is still an issue.
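For anyone hitting the same missing-method error, the shape of the fix is roughly as follows. This is a sketch with hypothetical stand-in types (`ToyMaze`, `ToyState`, `ToyTerminal`, `toy_reward`), not the actual `TMaze` code, and the zero reward for terminal states is an assumption made for illustration:

```julia
# Hypothetical stand-ins; the real TMaze and its state types live elsewhere.
struct ToyMaze end
struct ToyState
    x::Int
end
struct ToyTerminal end

# Reward for ordinary states (illustrative values only).
toy_reward(::ToyMaze, s::ToyState, a::Int) = (s.x == 3 && a == 1) ? 1.0 : 0.0

# Without this method, toy_reward(m, ToyTerminal(), a) throws a MethodError,
# which mirrors the failure described above. Assumed here: terminal states
# yield zero reward.
toy_reward(::ToyMaze, ::ToyTerminal, ::Int) = 0.0
```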
@zsunberg was this resolved with https://github.com/JuliaPOMDP/POMDPs.jl/pull/559?
Well, "resolved" is an optimistic word, but yes, I think the behavior was changed in a positive way, so I am closing this :)
This isn't an error per se, but `isterminal` in `GenerativeBeliefMDP` is only true when the entire support is terminal, whereas when `@gen` is called, we sample a single state from the belief. That means `isterminal(bmdp, b)` can be `false` while `@gen(b, a)` may still fail because a terminal state is sampled. Is there a good reason to define "terminal belief" this way? I agree that checking if any terminal state is in the support might be overly reactive. This becomes a problem, for instance, when we want to generate a bunch of samples and early exit on a "terminal belief."
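To make the mismatch concrete, here is a small sketch comparing the "all" and "any" definitions on a belief with mixed support. The names (`ToyBelief`, `toy_isterminal`, `all_terminal`, `any_terminal`, `terminal_mass`) are hypothetical and written only for this example; they are not POMDPs.jl functions:

```julia
# Hypothetical toy belief: a finite support with matching probabilities.
struct ToyBelief
    states::Vector{Symbol}
    probs::Vector{Float64}
end

# Hypothetical terminal test; stands in for isterminal(pomdp, s).
toy_isterminal(s::Symbol) = s === :done

# "All" definition: terminal only when every support state is terminal.
all_terminal(b::ToyBelief) = all(toy_isterminal, b.states)

# "Any" definition: terminal as soon as one support state is terminal.
any_terminal(b::ToyBelief) = any(toy_isterminal, b.states)

# Probability that a single sampled state is terminal, i.e. how often a
# per-sample terminal check would trigger for this belief.
function terminal_mass(b::ToyBelief)
    total = sum(b.probs)
    mass = sum(p for (s, p) in zip(b.states, b.probs) if toy_isterminal(s); init = 0.0)
    return mass / total
end
```

For a belief such as `ToyBelief([:left, :right, :done], [0.5, 0.3, 0.2])`, `all_terminal` is `false` and `any_terminal` is `true`, while a single sample lands on the terminal state with probability about 0.2. That gap is where the belief-level check and the per-sample check disagree.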