Change the variables stored in the board history from
[current state, legal moves, current player, action, next state, next legal moves, next player, done, winner]
to
[current state, legal moves, current player, action, done, winner]
The next state is chosen selectively, using separate logic, when storing to the replay buffer, so it does not need to be kept in the history.
Storing fewer objects per history entry will also help speed up the board augmentation function.
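
A minimal sketch of what the trimmed record could look like. The names `HistoryEntry` and `next_fields` are hypothetical and not taken from the repository. The issue does not describe how the next state is actually selected for the replay buffer, so the sketch stands in the simplest possible rule: use the following history entry.

```python
from typing import Any, List, NamedTuple, Optional, Tuple


class HistoryEntry(NamedTuple):
    """One trimmed board-history record (hypothetical field names)."""
    state: Any
    legal_moves: List[int]
    player: int
    action: int
    done: bool
    winner: Optional[int]


def next_fields(history: List[HistoryEntry], i: int) -> Tuple[Any, List[int], int]:
    """Recover (next state, next legal moves, next player) for entry i.

    Assumption: the successor is simply the following entry. A terminal
    entry has no successor, so it falls back to its own fields. The real
    selection logic may differ.
    """
    nxt = history[i + 1] if i + 1 < len(history) else history[i]
    return nxt.state, nxt.legal_moves, nxt.player
```

With six fields per entry instead of nine, any function that transforms every entry (such as the board augmentation) has fewer objects to copy, and the next-state fields are derived only at the point where the replay buffer needs them.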
resolved in e40ba99