calculateReward only runs at the end

snowfrogdev / macao

A general purpose game playing A.I. framework based on the Monte Carlo tree search algorithm.

MIT License

23 stars, 4 forks

calculateReward only runs at the end #10

Closed (flesler closed this issue 5 years ago)

flesler commented 5 years ago

Some games have incremental rewards based on actions. It's not the best example because it's just an AI, but a snake AI should be rewarded every time it eats, and the scenario where it eats should be very desirable. Should the reward function run on every turn, with its score accumulated? Of course, the user of the library can do this in the state, but I'm wondering whether it would help the MCTS go for desirable scenarios mid-game.

snowfrogdev commented 5 years ago

@flesler Sorry, I had not seen your issue and wasn't doing much with this repo until recently. The MCTS algorithm simulates an entire game many, many times and, based on the final outcome of each simulation, recommends the next move to play. In the case of the Snake game, if the calculateReward function is based on the score, which itself is based on eating, it would optimize for that automatically.
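
A minimal sketch of that idea, assuming a hypothetical 1-D snake game made up for illustration. It does not call macao: `SnakeState`, `applyAction`, `stateIsTerminal`, `makeRandom`, `randomPlayout` and `averageReward` are illustrative names, and the search is plain flat Monte Carlo (average the final reward of random playouts for each first move), not a full tree search. The point is only that a reward read at the end of a simulation still reflects every meal, because the score is accumulated in the state.

```typescript
// Hypothetical 1-D "snake" used only for illustration; this is not the macao API.
type Action = "left" | "right";

interface SnakeState {
  position: number;  // current cell, 0..BOARD_SIZE-1
  food: number[];    // cells that still hold food
  score: number;     // accumulated every time the snake eats
  turnsLeft: number; // the game ends when this reaches 0
}

const BOARD_SIZE = 7;

// Returns a new state; the incremental reward is folded into state.score.
function applyAction(state: SnakeState, action: Action): SnakeState {
  const step = action === "left" ? -1 : 1;
  const position = Math.min(BOARD_SIZE - 1, Math.max(0, state.position + step));
  const ate = state.food.includes(position);
  return {
    position,
    food: ate ? state.food.filter((cell) => cell !== position) : state.food,
    score: state.score + (ate ? 1 : 0),
    turnsLeft: state.turnsLeft - 1,
  };
}

function stateIsTerminal(state: SnakeState): boolean {
  return state.turnsLeft <= 0;
}

// Evaluated only on terminal states, yet it reflects every meal along the way.
function calculateReward(state: SnakeState): number {
  return state.score;
}

// Small seeded linear congruential generator so runs are repeatable.
function makeRandom(seed: number): () => number {
  let value = seed >>> 0;
  return () => {
    value = (Math.imul(value, 1664525) + 1013904223) >>> 0;
    return value / 4294967296;
  };
}

// Plays uniformly random moves until the game ends and returns the final reward.
function randomPlayout(state: SnakeState, random: () => number): number {
  let current = state;
  while (!stateIsTerminal(current)) {
    current = applyAction(current, random() < 0.5 ? "left" : "right");
  }
  return calculateReward(current);
}

// Flat Monte Carlo estimate of how good a first move is.
function averageReward(
  state: SnakeState,
  first: Action,
  playouts: number,
  random: () => number
): number {
  let total = 0;
  for (let i = 0; i < playouts; i++) {
    total += randomPlayout(applyAction(state, first), random);
  }
  return total / playouts;
}

// All the food is to the right of the snake.
const start: SnakeState = { position: 2, food: [3, 4, 5], score: 0, turnsLeft: 3 };
const random = makeRandom(1);
const leftAverage = averageReward(start, "left", 200, random);
const rightAverage = averageReward(start, "right", 200, random);
console.log({ leftAverage, rightAverage });
```

In this setup, moving right first always ends with a reward of at least 1, while moving left first can end with at most 1, so the averaged terminal reward favours the move that leads to eating even though nothing is scored mid-simulation.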