Incorrect reward calculation when `not_ended_reward` isn't default

When initialising a `Player()` object, we currently allow `not_ended_reward` to be passed as an argument. However, the method `calculate_rewards()` is hardcoded for `not_ended_reward=0`. I'm not sure the fix is as simple as replacing the 0 with `self.params["not_ended_reward"]`, since we use discounting. There might be a simple way of fixing it, but I haven't given it much thought yet 🤔
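To make the discounting concern concrete, here is a minimal sketch of how discounted returns could be computed when non-terminal moves carry a non-zero reward. It is not the current `calculate_rewards()` code: the function name `discounted_returns` and the parameters `n_moves`, `final_reward` and `gamma` are hypothetical, and it assumes every move before the last one receives `not_ended_reward`.

```python
def discounted_returns(n_moves, final_reward, not_ended_reward=0.0, gamma=0.95):
    """Return the discounted return for each move of one finished game.

    Hypothetical helper, not the repository's calculate_rewards().
    Assumes moves 0..n_moves-2 each get not_ended_reward and the last
    move gets final_reward (e.g. the win/loss/tie reward).
    """
    if n_moves < 1:
        raise ValueError("n_moves must be at least 1")

    # Per-move immediate rewards for the whole game.
    rewards = [not_ended_reward] * (n_moves - 1) + [final_reward]

    # Walk backwards so each return is reward + gamma * (next return).
    returns = [0.0] * n_moves
    running = 0.0
    for t in reversed(range(n_moves)):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

With `not_ended_reward=0` this reduces to discounting only the final reward, which is presumably what the hardcoded version does. With a non-zero value, every intermediate reward contributes to the earlier returns, so swapping a single 0 would likely not be enough.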

RasmusBrostroem / ConnectFourRL #76