RasmusBrostroem / ConnectFourRL


Use any reward for update #106

Closed jbirkesteen closed 1 year ago

jbirkesteen commented 1 year ago

Closes #97. Changed incremental_update() so that it always uses the most recently appended reward (except at t0).

Also added the argument terminal_state. This prevents a possible bug in the old clean-up check: in the special case where not_ended_reward is not unique (e.g. equal to tie_reward), the clean-up would be triggered prematurely.
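
To illustrate the special case, here is a minimal sketch of why a reward-based check can misfire when reward values collide. It assumes the old check compared the last reward against the terminal rewards, which this description does not confirm. The helper names, win_reward, loss_reward, and all numeric values are hypothetical and are not taken from the repository.

```python
from typing import List


def should_clean_up_by_reward(last_reward: float, terminal_rewards: List[float]) -> bool:
    # Hypothetical reward-based check: treats any reward that matches a
    # terminal reward value as a sign that the game has ended.
    return last_reward in terminal_rewards


def should_clean_up_by_flag(terminal_state: bool) -> bool:
    # Hypothetical flag-based check: relies only on the explicit argument,
    # so reward values never influence the decision.
    return terminal_state


# Hypothetical reward values; not_ended_reward deliberately equals tie_reward.
win_reward, loss_reward, tie_reward = 1.0, -1.0, 0.0
not_ended_reward = 0.0
terminal_rewards = [win_reward, loss_reward, tie_reward]

# The game is still running, so the last appended reward is not_ended_reward.
premature = should_clean_up_by_reward(not_ended_reward, terminal_rewards)
explicit = should_clean_up_by_flag(terminal_state=False)

print(premature)  # True: the reward-based check fires mid-game
print(explicit)   # False: the explicit flag does not
```

Passing the terminal state explicitly decouples clean-up from the reward configuration, so any combination of reward values stays safe.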