JuliaPOMDP / POMDPs.jl

MDPs and POMDPs in Julia - An interface for defining, solving, and simulating fully and partially observable Markov decision processes on discrete and continuous spaces.
http://juliapomdp.github.io/POMDPs.jl/latest/

DiscreteUpdater does not take isterminal into account #471

Open zsunberg opened 1 year ago

zsunberg commented 1 year ago

Currently the DiscreteUpdater does not check for terminal states.

If isterminal and the simulator are implemented correctly, then whenever we receive an observation we should be able to conclude that we were not in a terminal state.
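As a concrete illustration, here is a self-contained toy sketch in plain Julia (hypothetical arrays, not the POMDPs.jl API): conditioning the prior belief on the event "an observation was received" zeros out terminal-state mass before the usual Bayes update.

```julia
# Toy belief over 3 states; state 3 is terminal (assumed setup, not POMDPs.jl).
b = [0.5, 0.3, 0.2]
isterm = [false, false, true]

# Receiving an observation implies we were not in a terminal state,
# so zero out terminal mass and renormalize before the usual update.
b_cond = [isterm[s] ? 0.0 : b[s] for s in eachindex(b)]
b_cond ./= sum(b_cond)
# b_cond is now [0.625, 0.375, 0.0]
```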

dylan-asmar commented 7 months ago

This isn't an issue if the observation function is defined correctly, right?

In update(bu::DiscreteUpdater, b::DiscreteBelief, a, o), we have

```julia
# s, si, and td come from the enclosing loop over the prior belief's
# states, with td = transition(pomdp, s, a).
for (sp, tp) in weighted_iterator(td)
    spi = stateindex(pomdp, sp)
    op = obs_weight(pomdp, s, a, sp, o)
    bp[spi] += op * tp * b.b[si]
end
```

My thought is that if we aren't expecting an observation when sp is terminal, then that should be reflected in op = obs_weight(pomdp, s, a, sp, o).

Is there a case you saw or were thinking of where this isn't the case?
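For illustration, this approach can be sketched with plain arrays (a hypothetical toy model, not the POMDPs.jl API): if the observation likelihood is defined to be zero for terminal sp, terminal successors drop out of the posterior automatically.

```julia
# Toy 3-state, 2-observation model; state 3 is terminal (assumed setup).
n_states, n_obs = 3, 2
Z = [0.9 0.1;   # P(o | sp = 1)
     0.2 0.8;   # P(o | sp = 2)
     0.0 0.0]   # sp = 3 is terminal: no observation is ever emitted
T = [0.7 0.2 0.1;
     0.1 0.6 0.3;
     0.0 0.0 1.0]
b = [0.6, 0.4, 0.0]
o = 1

# Hypothetical stand-in for obs_weight: zero for terminal sp by construction.
obs_weight_toy(sp, o) = Z[sp, o]

bp = zeros(n_states)
for s in 1:n_states, sp in 1:n_states
    bp[sp] += obs_weight_toy(sp, o) * T[s, sp] * b[s]
end
bp ./= sum(bp)
# bp[3] == 0: the terminal successor gets no posterior mass
```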

zsunberg commented 6 months ago

I think solvers, belief updaters, and simulators should avoid calling obs_weight or observation when s is terminal. We should not force problem implementers to define obs_weight or observation for terminal states: it is often unclear what observation distribution should be returned, and doing so just forces them to write dummy code for that case.
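A toy sketch of this proposal (plain Julia arrays, hypothetical model, not the POMDPs.jl API): the updater itself skips terminal states on both sides of the transition, so the observation model is never queried for them and needs no rows for terminal states.

```julia
# 3 states, state 3 terminal; 2 observations (assumed setup, not POMDPs.jl).
isterm = [false, false, true]
T = [0.7 0.2 0.1;
     0.1 0.6 0.3;
     0.0 0.0 1.0]
Z = [0.9 0.1;
     0.2 0.8]          # rows only for the two non-terminal states
b = [0.5, 0.3, 0.2]    # prior belief with some mass on the terminal state
o = 2

bp = zeros(3)
for s in 1:3
    isterm[s] && continue        # never query the model from a terminal state
    for sp in 1:3
        isterm[sp] && continue   # never ask for an observation from one either
        bp[sp] += Z[sp, o] * T[s, sp] * b[s]
    end
end
bp ./= sum(bp)
# Terminal mass in the prior is ignored and bp[3] stays exactly 0.
```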

dylan-asmar commented 6 months ago

That makes sense to me.