sawcordwell / pymdptoolbox

Markov Decision Process (MDP) Toolbox for Python
BSD 3-Clause "New" or "Revised" License
518 stars 252 forks

ValueIterationGS _boundIter is incorrect #14

Open sawcordwell opened 9 years ago

sawcordwell commented 9 years ago

The _boundIter method causes ValueIterationGS to stop before it has reached an epsilon-optimal value function. Steps to reproduce:

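import mdptoolbox.example
# Build the transition (P) and reward (R) arrays for the forest example
P, R = mdptoolbox.example.forest()
# Gauss-Seidel value iteration with a discount factor of 0.96
vigs = mdptoolbox.mdp.ValueIterationGS(P, R, 0.96)
vigs.setVerbose()
vigs.run()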

Expected behaviour: _boundIter should compute an iteration bound that the algorithm can actually converge within, or it should not be used at all.
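
To illustrate the failure mode independently of the toolbox, here is a minimal stand-alone sketch of Gauss-Seidel value iteration in plain Python. It is not the pymdptoolbox implementation: the function `gauss_seidel_vi`, its parameters, and the hand-written 3-state MDP are all assumptions made for illustration (the numbers are meant to resemble the forest example, but are not taken from the library). The sketch stops only when the largest per-state change falls below a threshold, and reports whether that happened or whether the iteration cap was hit first.

```python
# Stand-alone sketch; none of these names come from pymdptoolbox.
# Hand-written 3-state, 2-action MDP (assumed values, for illustration only).
# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.1, 0.9, 0.0], [0.1, 0.0, 0.9], [0.1, 0.0, 0.9]],  # action 0
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # action 1
]
# R[s][a]: reward for taking action a in state s.
R = [[0.0, 0.0], [0.0, 1.0], [4.0, 2.0]]


def gauss_seidel_vi(P, R, discount, epsilon=0.01, max_iter=10000):
    """Return (V, iterations, converged) for Gauss-Seidel value iteration."""
    n_actions, n_states = len(P), len(R)
    V = [0.0] * n_states
    # A commonly used stopping threshold for discounted problems.
    threshold = epsilon * (1.0 - discount) / discount
    for iteration in range(1, max_iter + 1):
        delta = 0.0
        for s in range(n_states):
            best = max(
                R[s][a] + discount * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update: later states in this sweep see it
        if delta < threshold:
            return V, iteration, True
    # The cap was reached before the stopping threshold was met.
    return V, max_iter, False


# Generous cap: the run ends because the threshold is met.
V_full, iters_full, ok_full = gauss_seidel_vi(P, R, 0.96)
# Deliberately small cap: the run is cut off before convergence.
V_capped, iters_capped, ok_capped = gauss_seidel_vi(P, R, 0.96, max_iter=5)

print("full run:  ", iters_full, "iterations, converged =", ok_full)
print("capped run:", iters_capped, "iterations, converged =", ok_capped)
```

Returning an explicit `converged` flag makes a premature stop visible to the caller instead of silently handing back a value function that is not yet epsilon-optimal.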