Open utterances-bot opened 1 year ago
Hi, thank you for the example. It helped me a lot in understanding the concepts. However, I find it a pity that there is so little description and commentary in the lower part. It took me some time to understand what the code snippets do.
Two mistakes in the code:
1) if (s[0] + s[1] + s[3] ... should be if (s[0] + s[1] + s[2] ... (this needs to be fixed in several places)
2) In play_v_random, the variable Q_values is not defined.
Q-Learning - A Random Walk
I’ve been reading some books on machine learning, and recently started going through Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron. His chapter on reinforcement learning is great, and it inspired me to apply some of the methods on my own. I decided to start with the game of tic-tac-toe since it’s a little more complicated than a trivial example but not as complicated as chess or Go.
https://jfking50.github.io/qlearning/