Closed hojoungjang closed 3 years ago
Hmm yea I think we can focus on agent vs agent for now. Either way though we will need to swap the turn each time the step method is called, since player moves will eventually call this same method too.
Hmm I think keeping 2 arrays, 1 for agent moves and 1 for enemy moves, will be better than clearing the array each time. Eventually we wouldn't want to clear the entire array either, but only remove the possible actions of the one piece that was moved.
Okay so I will focus on implementing self-play environment (agent playing both sides) with 2 separate possible_actions arrays like this: `self.black_possible_actions` and `self.red_possible_actions`
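A rough sketch of the two-array idea from the comments above, in case it helps review. Only the attribute names `black_possible_actions` and `red_possible_actions` come from this thread; the `ActionBook` class, the `(piece_id, start, end)` action format, and the `drop_piece_actions` helper are assumptions made up for illustration, not the repository's code.

```python
from typing import List, Tuple

# Hypothetical action format: (piece_id, start_square, end_square).
Action = Tuple[int, Tuple[int, int], Tuple[int, int]]


class ActionBook:
    """Keeps one possible-action list per side instead of one shared list."""

    def __init__(self) -> None:
        self.black_possible_actions: List[Action] = []
        self.red_possible_actions: List[Action] = []

    def actions_for(self, side: str) -> List[Action]:
        # `side` is assumed to be the string "black" or "red".
        if side == "black":
            return self.black_possible_actions
        if side == "red":
            return self.red_possible_actions
        raise ValueError(f"unknown side: {side!r}")

    def drop_piece_actions(self, side: str, piece_id: int) -> None:
        # Remove only the moved piece's stale actions, in place,
        # leaving every other piece's entries untouched.
        actions = self.actions_for(side)
        actions[:] = [a for a in actions if a[0] != piece_id]


book = ActionBook()
book.black_possible_actions += [(1, (0, 0), (1, 0)), (2, (0, 1), (2, 2))]
book.red_possible_actions += [(1, (9, 0), (8, 0))]
book.drop_piece_actions("black", 1)  # black piece 1 just moved
```

After the last call only black piece 2's action remains on the black side, and the red list is unchanged, so nothing has to be cleared and rebuilt for the side that did not move.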
Description
UPDATED (Ready for merge): Pretty messy diff logs but please review the new changes for me. Main new features and changes:
- `step()` method in self-playing mode; currently the terminating condition is the death of either general (no check or checkmate detection yet)
- `get_possible_actions()` has a new parameter called `player` denoting which side is currently playing
- Instead of `self.possible_actions`, `XiangQiEnv` now has 2 action spaces called `self.agent_actions` and `self.enemy_actions`
- `get_actions()` method

Type of change
How has this been tested?
Unit tests, along with CI.
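For reviewers, the self-play behavior listed in the Description can be pictured roughly like this. It is a toy stand-in, not the actual `XiangQiEnv`: the `SelfPlaySketch` class, the `AGENT`/`ENEMY` constants, the string actions, and the `(reward, done)` return value are all assumptions for illustration. Only `step()`, `get_possible_actions()` with its `player` parameter, `self.agent_actions`, and `self.enemy_actions` are names taken from this PR.

```python
AGENT, ENEMY = 0, 1  # hypothetical side labels


class SelfPlaySketch:
    """Toy stand-in for a two-sided environment; not the real XiangQiEnv."""

    def __init__(self):
        self.turn = AGENT
        self.general_alive = {AGENT: True, ENEMY: True}
        # One action list per side; each action here is just a label.
        self.agent_actions = ["advance", "capture_general"]
        self.enemy_actions = ["advance", "capture_general"]

    def get_possible_actions(self, player):
        # `player` selects which side's list to read.
        return self.agent_actions if player == AGENT else self.enemy_actions

    def step(self, action):
        if action not in self.get_possible_actions(self.turn):
            raise ValueError(f"illegal action for side {self.turn}: {action!r}")
        opponent = ENEMY if self.turn == AGENT else AGENT
        if action == "capture_general":
            self.general_alive[opponent] = False
        # Episode ends only when a general is dead (no check/checkmate logic).
        done = not all(self.general_alive.values())
        reward = 1.0 if done else 0.0
        # Swap the turn on every call, so the same method serves both sides.
        self.turn = opponent
        return reward, done


env = SelfPlaySketch()
first = env.step("advance")            # agent side moves
second = env.step("capture_general")   # enemy side moves and ends the episode
```

Swapping `self.turn` inside `step()` is what lets one method serve both sides in self-play, and later a human player's moves as well.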