RasmusBrostroem / ConnectFourRL


Implement TD agent #85

Closed by jbirkesteen 1 year ago

jbirkesteen commented 1 year ago

Implements our first version of the agent discussed in #81.

Adds the TDAgent Player class, a self-play method on the environment, and an initial version of a self-training script.

The state representation follows #81. In self-play, the opponent is created by freezing the network at the beginning of each game.
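
For reviewers, here is a rough sketch of the freezing idea, not the code in this PR. It assumes "freezing" means taking a deep copy of the learner at the start of a game, so that updates made during the game do not leak into the opponent. `LinearValueNet`, `self_play_game`, and the random feature vectors are hypothetical stand-ins for the real network, environment method, and state representation from #81. The sketch also ignores whose perspective the value is computed from.

```python
import copy
import random


class LinearValueNet:
    """Hypothetical stand-in for the agent's network: a linear value function."""

    def __init__(self, n_features, seed=0):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]

    def value(self, features):
        # V(s) = w . x
        return sum(w * x for w, x in zip(self.weights, features))

    def td_update(self, features, target, lr=0.1):
        # TD(0) step for a linear model: w += lr * (target - V(s)) * x
        error = target - self.value(features)
        self.weights = [w + lr * error * x
                        for w, x in zip(self.weights, features)]
        return error


def self_play_game(learner, n_turns=6, n_features=4, seed=0):
    """Play one toy game against a snapshot of the learner (assumed design)."""
    rng = random.Random(seed)
    # Freeze the opponent: an independent copy taken before any update.
    opponent = copy.deepcopy(learner)
    state = [rng.random() for _ in range(n_features)]
    for turn in range(n_turns):
        # Hypothetical "legal moves": three random successor feature vectors.
        candidates = [[rng.random() for _ in range(n_features)]
                      for _ in range(3)]
        mover = learner if turn % 2 == 0 else opponent
        next_state = max(candidates, key=mover.value)
        if mover is learner:
            # Only the learner is trained; the opponent stays fixed.
            learner.td_update(state, target=learner.value(next_state))
        state = next_state
    return opponent


learner = LinearValueNet(n_features=4)
weights_before = list(learner.weights)
opponent = self_play_game(learner)
```

After the game, `opponent.weights` still equals `weights_before`, while `learner.weights` has moved. With a PyTorch module, the same effect would presumably come from `copy.deepcopy(model)` followed by `eval()`, but that is an assumption about the implementation.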