Sorry, this should be a PR, but I don't have a clean forked version handy to work from and it's a really simple fix, so I'm filing it as an issue...
In the test for convergence of q_learning, I think this line: https://github.com/cmaron/CS-7641-assignments/blob/5a4c511332f0ee764cc768b6bc76a1422a98244c/assignment4/solvers/q_learning.py#L69
should be:

```python
self._last_delta = max(self._last_delta, abs(td_delta))
```
Without `abs()`, every negative delta looks like convergence. I have a case where the estimate approaches the true Q from above, so all the `td_delta` values are negative, and the solver thinks it has converged from iteration 0 even though the Q-value changes are in the tens or hundreds.
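To illustrate, here is a minimal sketch of the convergence check in isolation (the delta values, threshold name `theta`, and loop structure are made up for the example, not taken from the repository):

```python
# Hypothetical TD deltas when Q approaches the true value from above:
# all updates are negative but large in magnitude.
td_deltas = [-50.0, -120.0, -80.0]

# Buggy tracking: max over signed deltas never rises above 0,
# so the run looks converged immediately.
last_delta_buggy = 0.0
for td_delta in td_deltas:
    last_delta_buggy = max(last_delta_buggy, td_delta)

# Fixed tracking: max over |td_delta| records the true magnitude of change.
last_delta_fixed = 0.0
for td_delta in td_deltas:
    last_delta_fixed = max(last_delta_fixed, abs(td_delta))

theta = 1e-3  # illustrative convergence threshold
print(last_delta_buggy < theta)  # True  -> falsely reports convergence
print(last_delta_fixed < theta)  # False -> correctly keeps iterating
```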
Thanks for the great code to work with!