RasmusBrostroem / ConnectFourRL


Incorrect reward calculation when `not_ended_reward` isn't default #76

Open jbirkesteen opened 1 year ago

jbirkesteen commented 1 year ago

When initialising a `Player()` object, we currently allow `not_ended_reward` to be passed as an argument. However, the method `calculate_rewards()` is hardcoded for `not_ended_reward=0`. I'm not sure the fix is as simple as replacing the `0` with `self.params["not_ended_reward"]`, since we use discounting. There might be a simple way of fixing it, but I haven't given it much thought yet 🤔
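One possible direction, as a rough sketch only: I haven't checked it against the actual `calculate_rewards()` code, so the function name `discounted_returns`, its arguments and the default `gamma` are all hypothetical. The idea is to treat `not_ended_reward` as the per-move reward for every move except the last, then accumulate the discounted return backwards with the usual recursion `G_t = r_t + gamma * G_{t+1}`. With `not_ended_reward=0` this reduces to the final reward multiplied by a power of `gamma`.

```python
def discounted_returns(n_moves, final_reward, not_ended_reward=0.0, gamma=0.95):
    """Hypothetical helper: discounted return for each move of one game.

    Assumes every move except the last earns `not_ended_reward` and the
    last move earns `final_reward` (win/loss/tie reward).
    """
    if n_moves <= 0:
        return []

    # Immediate reward received after each move.
    rewards = [not_ended_reward] * (n_moves - 1) + [final_reward]

    returns = [0.0] * n_moves
    running = 0.0
    # Walk backwards so each return includes the discounted future return.
    for t in reversed(range(n_moves)):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Same game, with and without a per-move reward.
    print(discounted_returns(3, 1.0, not_ended_reward=0.0, gamma=0.5))
    print(discounted_returns(3, 1.0, not_ended_reward=0.1, gamma=0.5))
```

For a three-move game with `gamma=0.5` and a final reward of 1, the default case gives `[0.25, 0.5, 1.0]`, while `not_ended_reward=0.1` gives approximately `[0.4, 0.6, 1.0]`, so the two differ on every move but the last. Whether this matches the intended semantics of `not_ended_reward` in this repo is an open question.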