miroesli / psscscs

Reinforcement Learning Battlesnake
MIT License

The algorithm is almost done. #5

Closed: Fool-Yang closed this issue 4 years ago

Fool-Yang commented 4 years ago

I used the AlphaGo Zero training algorithm to train the network. Attachments: nature24270.pdf, Simple Alpha Zero.pdf
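For readers without the attachments at hand, the training objective in the AlphaGo Zero paper can be written as below. This restates the paper's loss only; whether this repository's train.py uses exactly this form is an assumption. Here $(\mathbf{p}, v) = f_\theta(s)$ is the network's policy and value output for state $s$, $\boldsymbol{\pi}$ is the search-derived move distribution, $z$ is the game outcome, and $c$ is the weight-decay coefficient.

$$
l = (z - v)^2 - \boldsymbol{\pi}^{\top} \log \mathbf{p} + c \lVert \theta \rVert^2
$$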

Fool-Yang commented 4 years ago

The only things left are to complete the code in agent.py and train.py, and maybe to implement an MCTS (Monte Carlo tree search) as well.
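As a rough illustration of the MCTS piece, here is a minimal sketch of the PUCT child-selection rule used in AlphaGo Zero style search. It is not code from this repository: `puct_select`, its parameters, and the default `c_puct` are hypothetical names and values. The `+ 1` under the square root is a common implementation tweak (not part of the paper's formula) so that unvisited children are ranked by their prior. Battlesnake is a simultaneous-move game, so a real search would need more than this single-agent step.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child that maximizes Q + U (hypothetical helper)."""
    total_visits = sum(visit_counts)
    best_action, best_score = None, -math.inf
    for action, prior in enumerate(priors):
        n = visit_counts[action]
        # Q: mean value of the child so far; 0 for unvisited children.
        q = value_sums[action] / n if n > 0 else 0.0
        # U: exploration bonus, large for high-prior, rarely visited children.
        u = c_puct * prior * math.sqrt(total_visits + 1) / (1 + n)
        score = q + u
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

Called once per tree level with the network's priors for the four moves, this picks which child to descend into; expansion, evaluation, and backup are left out.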

Fool-Yang commented 4 years ago

I realize it is stupid to do the preprocessing from scratch on every single board. Each board depends on the previous one, so there is no reason to discard that information. I will rewrite game.py to generate the state instead.
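The idea of carrying the state forward could be sketched roughly as follows. This is an illustration under assumptions, not the actual game.py: the class `IncrementalBoard`, its methods, and the single 0/1 occupancy grid are all hypothetical, and the growth rule is simplified (the tail simply stays in place on the turn food is eaten). Each turn touches only the cells that changed instead of rebuilding the grid.

```python
from collections import deque


class IncrementalBoard:
    """Hypothetical board encoding that is updated in place each turn."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # 0 = empty, 1 = occupied by a snake segment.
        self.grid = [[0] * width for _ in range(height)]
        self.snakes = {}  # snake id -> deque of (x, y), head first

    def add_snake(self, snake_id, body):
        self.snakes[snake_id] = deque(body)
        for x, y in body:
            self.grid[y][x] = 1

    def step(self, snake_id, new_head, ate_food=False):
        """Advance one snake by one move, updating only the changed cells."""
        body = self.snakes[snake_id]
        if not ate_food:
            tail_x, tail_y = body.pop()
            # Segments can be stacked on one cell at the start of a game,
            # so clear the cell only if no other segment remains there.
            if (tail_x, tail_y) not in body:
                self.grid[tail_y][tail_x] = 0
        body.appendleft(new_head)
        self.grid[new_head[1]][new_head[0]] = 1
```

For example, after `board.add_snake("a", [(2, 2), (2, 1), (2, 0)])`, a call to `board.step("a", (2, 3))` clears the old tail cell and marks the new head cell, leaving the rest of the grid untouched.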


Done