Model training issues

Reward is not increasing as the model trains, which indicates that something is wrong with the training process. Possible causes include:

Tracked grid input is not defined well enough to help the model find the correct areas to travel to.
Algorithm is implemented incorrectly, so learning is difficult (?)
Episodes are not run for enough steps for the model to learn the environment that it's actually exploring
Something wrong with the hyper-parameters alpha, beta, tau (least likely reason)
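
Before digging into any of the causes above, it may help to confirm that the reward really is flat rather than just noisy. A minimal sketch, assuming the per-episode rewards are available as a plain list of floats; the function names `moving_average` and `trend_slope` are hypothetical helpers and not part of this repo:

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Smooth a reward curve with a trailing window of fixed size."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed: List[float] = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        if i >= window - 1:
            smoothed.append(running / window)
    return smoothed


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (episode number)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not results from this project.
    episode_rewards = [1.0, 0.5, 1.5, 1.0, 0.5, 1.5, 1.0, 0.5]
    smoothed = moving_average(episode_rewards, window=4)
    print("slope of smoothed reward per episode:", trend_slope(smoothed))
```

A slope close to zero on the smoothed curve across many episodes would support the observation that the reward is not improving. Repeating the check after changing only the steps per episode, or only one of alpha, beta, tau, is one way to narrow down which cause is responsible.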

aLadNamedPat / SPUR