Closed by mpnunez 2 months ago
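$value_i = \sum_{k=0}^{n-1} \gamma^k r_{i+k} + \gamma^n \, value_{i+n}$, i.e. the sum of discounted rewards for the next $n$ steps plus the discounted value of the state $n$ steps from now. Here $\gamma$ is the discount factor and $r_{i+k}$ is the reward received $k$ steps after step $i$. $n$ can be any value between 1 and the number of steps until the end of the episode.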
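A minimal sketch of this n-step target in Python, for illustration only; it is not taken from the linked commit. The function name `n_step_value`, the list layout (`rewards[t]` is the reward for step `t`, `values[t]` is the current value estimate for the state at step `t`), the discount factor `gamma`, and the convention that the value past the end of the episode is 0 are all assumptions.

```python
from typing import Sequence


def n_step_value(
    rewards: Sequence[float],
    values: Sequence[float],
    i: int,
    n: int,
    gamma: float = 0.99,
) -> float:
    """Return the n-step value target for step i of one episode.

    Hypothetical layout: rewards[t] is the reward received at step t and
    values[t] is the estimated value of the state at step t.
    """
    steps_left = len(rewards) - i
    if not 1 <= n <= steps_left:
        raise ValueError(f"n must be between 1 and {steps_left}, got {n}")

    # Sum of discounted rewards for the next n steps.
    discounted_rewards = sum(gamma**k * rewards[i + k] for k in range(n))

    # Discounted value of the state n steps from now. If that state lies
    # past the end of the episode, assume its value is 0 (terminal state).
    bootstrap = values[i + n] if i + n < len(values) else 0.0

    return discounted_rewards + gamma**n * bootstrap


if __name__ == "__main__":
    episode_rewards = [0.0, 0.0, 1.0]   # e.g. a win on the last move
    value_estimates = [0.5, 0.6, 0.8]   # made-up estimates for illustration
    for n in (1, 2, 3):
        print(n, n_step_value(episode_rewards, value_estimates, i=0, n=n, gamma=0.5))
```

With `n = 1` this reduces to the one-step bootstrapped target; with `n` equal to the number of remaining steps it becomes the full discounted return with no bootstrapping.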
https://github.com/mpnunez/Connect4-AI/commit/f848e5166e814553b53787a4a3bc035f2ffef315