When building the R(s, a) matrix from R(s, a, sp), this PR ensures that R(s, a) is zero for terminal states.
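As a rough illustration of the idea (not the actual code in this PR), the sketch below computes R(s, a) as the expectation of R(s, a, sp) over next states and skips terminal states so their rows stay at zero. The function name `expected_reward`, the dense array layout `T[s, a, sp]` / `R3[s, a, sp]`, and the `terminal` flag vector are all assumptions made for this example.

```julia
# Hypothetical helper for illustration only; names and array layout are assumptions.
# T[s, a, sp]  : probability of moving from s to sp under action a
# R3[s, a, sp] : reward for that transition
# terminal[s]  : true if s is a terminal state
function expected_reward(T::Array{Float64,3}, R3::Array{Float64,3},
                         terminal::AbstractVector{Bool})
    nS, nA, _ = size(R3)
    R = zeros(nS, nA)
    for s in 1:nS
        terminal[s] && continue  # leave R[s, :] at zero for terminal states
        for a in 1:nA, sp in 1:nS
            R[s, a] += T[s, a, sp] * R3[s, a, sp]
        end
    end
    return R
end

# Two states, one action; state 2 is terminal and self-loops.
T = zeros(2, 1, 2)
T[1, 1, :] = [0.5, 0.5]
T[2, 1, 2] = 1.0
R3 = zeros(2, 1, 2)
R3[1, 1, :] = [1.0, 3.0]
R3[2, 1, 2] = 10.0          # would leak into R(s, a) without the terminal check
terminal = [false, true]

R = expected_reward(T, R3, terminal)
```

Without the terminal check, the self-loop reward of the terminal state would be counted in R(s, a) even though no further transitions should be taken from it.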
Fixes #27 and also #26.
Here is an MWE for #26:
using POMDPs
using DiscreteValueIteration
using SubHunt
using POMDPModelTools
mdp = UnderlyingMDP(SubHuntPOMDP())
policy = solve(ValueIterationSolver(), mdp)
We need to tag a new version of POMDPModelTools.jl. The error is an ambiguity error with the convenience implementations, which have been removed on POMDPModelTools master.