Open zsunberg opened 4 years ago
@zsunberg, can you point to where this issue appears in the docs? The only thing I can find is the online solver example (https://juliapomdp.github.io/POMDPs.jl/latest/online_solver/). In my last update I added a heuristic policy example (https://juliapomdp.github.io/POMDPs.jl/latest/example_solvers/#Heuristic-Policy), but I'm not sure that addresses this issue.
Because the example uses SimpleGridWorld, which gives reward for taking an action in a reward state, there is nothing interesting for the solver to figure out. For example, if the grid world were 3 x 1, you could do
and it would still output
:right
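For anyone who wants to poke at this, here is a rough sketch of a 3 x 1 setup. It is not the snippet from the comment above; the reward placement, the reward value, and the choice of `ValueIterationSolver` from DiscreteValueIteration.jl (instead of the online solver from the docs example) are all my assumptions, picked only to keep the sketch short.

```julia
using POMDPs
using POMDPModels            # provides SimpleGridWorld and GWPos
using DiscreteValueIteration # provides ValueIterationSolver

# Assumed layout: a 3 x 1 grid with a single reward cell at the right end.
# The reward value 1.0 is arbitrary. With the default `terminate_from`,
# the episode ends after an action is taken in the reward cell.
mdp = SimpleGridWorld(size=(3, 1), rewards=Dict(GWPos(3, 1) => 1.0))

# Offline value iteration stands in for the online solver here (assumption).
solver = ValueIterationSolver(max_iterations=100)
policy = solve(solver, mdp)

# Print the action the policy selects in the leftmost cell.
a = action(policy, GWPos(1, 1))
println(a)
```

A more informative docs example would probably need a problem where the greedy choice and the planned choice differ, so that the solver's output shows it actually did some work.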