Chapter 6: Random Walk --> Infinite loop

Hello,

Very nice implementation of the examples from the book. Really helpful for better understanding of the ideas!

figure6_3(): For large alphas (e.g. 0.1), the batch updates for the random walk example sometimes seem to enter an infinite loop while waiting for the updates array to converge. The deltas crawl towards the 0.001 threshold very slowly and the program appears to hang. Replacing "while True" with a "max_iterations" limit avoids the hang, but then, in the next episode, each subsequent delta increases instead of decreasing as the iterations progress.

Sometimes the deltas increase during the iterations instead of going down, even when no earlier loop was cut short by "max_iterations" (this is with "while True" replaced by "while iteration < max_iterations").
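For reference, the capped loop described above can be sketched roughly as follows. This is a hypothetical stand-in, not the repository's figure6_3() code: the names `generate_episode`, `batch_td` and `max_iterations`, the 5-state walk layout, and the use of the summed absolute update as the "delta" are all assumptions made for illustration. It records every delta so that slow convergence and growing deltas can be told apart.

```python
import random

# Hypothetical layout: states 0 and 6 are terminal, 1..5 are non-terminal.
LEFT_TERMINAL, RIGHT_TERMINAL, START = 0, 6, 3


def generate_episode(rng):
    """Return one random-walk episode as a list of (state, reward, next_state)."""
    state = START
    transitions = []
    while state not in (LEFT_TERMINAL, RIGHT_TERMINAL):
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == RIGHT_TERMINAL else 0.0
        transitions.append((state, reward, next_state))
        state = next_state
    return transitions


def batch_td(transitions, values, alpha, threshold=1e-3, max_iterations=10_000):
    """Repeat batch TD(0) sweeps until the summed |update| drops below threshold.

    Returns (values, deltas, converged). The cap on sweeps replaces an
    unbounded "while True"; deltas holds the summed |update| of every sweep.
    """
    values = list(values)
    deltas = []
    for _ in range(max_iterations):
        # Accumulate the TD errors of all stored transitions, then scale by alpha.
        updates = [0.0] * len(values)
        for state, reward, next_state in transitions:
            updates[state] += reward + values[next_state] - values[state]
        updates = [alpha * u for u in updates]

        delta = sum(abs(u) for u in updates)
        deltas.append(delta)
        if delta < threshold:
            return values, deltas, True
        values = [v + u for v, u in zip(values, updates)]
    return values, deltas, False


if __name__ == "__main__":
    rng = random.Random(0)
    batch = []
    for _ in range(10):
        batch.extend(generate_episode(rng))
    initial = [0.0] + [0.5] * 5 + [0.0]
    _, deltas, converged = batch_td(batch, initial, alpha=0.001)
    print(converged, len(deltas), deltas[0], deltas[-1])
```

Because the updates are summed over every stored transition before being applied, the effective step per sweep grows with the size of the batch, so comparing `deltas[0]` with `deltas[-1]` for different alphas may help show whether the loop is merely slow or is actually moving away from the threshold.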

Have you had any problems of this kind, and do you have any idea why this might be happening?

Thank you

ShangtongZhang / reinforcement-learning-an-introduction

Chapter 6: Random Walk --> Infinite loop #72