Open zsunberg opened 1 year ago
This isn't an issue if the observation function is defined correctly, right?
In update(bu::DiscreteUpdater, b::DiscreteBelief, a, o), we have
    for (sp, tp) in weighted_iterator(td)
        spi = stateindex(pomdp, sp)
        op = obs_weight(pomdp, s, a, sp, o)
        bp[spi] += op * tp * b.b[si]
    end
My thinking is that if we don't expect an observation when sp is terminal, that should already be reflected in op = obs_weight(pomdp, s, a, sp, o).
Is there a case you saw or were thinking of where this isn't the case?
I think solvers, belief updaters, and simulators should avoid calling obs_weight or observation when s is terminal. We should not force problem implementers to define obs_weight or observation when s is terminal, since it is often not clear what observation distribution should be returned, and it will just force them to write dummy code to handle that case.
That makes sense to me.
Currently the DiscreteUpdater does not check for terminal states.
If isterminal and the simulator are implemented correctly, then receiving an observation lets us conclude that we were not in a terminal state.
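To make the idea concrete, here is a minimal standalone sketch of a discrete belief update that skips terminal states. It does not use POMDPs.jl and is not the DiscreteUpdater implementation: the three-state toy model, the constant STATES, and the function update_skip_terminal are hypothetical, and transition, obs_weight, stateindex, and isterminal are simplified stand-ins that only mirror the names used above. The toy obs_weight deliberately throws when s is terminal, to imitate a problem whose implementer never defined that case.

```julia
# Standalone sketch; nothing here comes from POMDPs.jl. All names are
# hypothetical stand-ins that mirror the functions quoted in the thread.

const STATES = [:left, :right, :done]      # :done is the terminal state
stateindex(s) = findfirst(==(s), STATES)
isterminal(s) = s == :done

# Toy transition model: a vector of (next state, probability) pairs.
function transition(s, a)
    s == :done && return [(:done, 1.0)]
    a == :stop && return [(:done, 1.0)]
    other = s == :left ? :right : :left
    return [(s, 0.9), (other, 0.1)]
end

# Toy observation weight, intentionally left undefined for terminal s.
function obs_weight(s, a, sp, o)
    isterminal(s) && error("obs_weight is not defined when s is terminal")
    return o == sp ? 0.8 : 0.2
end

# Belief update that never queries the model from a terminal state.
function update_skip_terminal(b::Vector{Float64}, a, o)
    bp = zeros(length(b))
    for (si, s) in enumerate(STATES)
        # Receiving an observation implies we were not in a terminal state,
        # so terminal s contributes nothing and obs_weight is never called.
        (isterminal(s) || b[si] == 0.0) && continue
        for (sp, tp) in transition(s, a)
            bp[stateindex(sp)] += obs_weight(s, a, sp, o) * tp * b[si]
        end
    end
    total = sum(bp)
    total > 0 || error("observation has zero probability under this belief")
    return bp ./ total
end

b = [0.4, 0.4, 0.2]                        # 0.2 of the mass sits on :done
bp = update_skip_terminal(b, :stay, :left)
```

In this sketch the mass on :done is dropped and the remaining belief is renormalized, which encodes the conclusion that an observation can only arrive from a non-terminal state. Whether the real updater should renormalize this way or raise an error instead is a design choice the discussion above leaves open.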