tanliyon / gym-xiangqi

This repo sets up an environment for playing Xiang Qi (Chinese chess) following the OpenAI Gym framework.
GNU Lesser General Public License v3.0

Implement step() method #19

Closed hojoungjang closed 3 years ago

hojoungjang commented 3 years ago

Description

This change implements the environment's step() method. It is an incomplete version, since we still need to detect check and checkmate conditions in order to update the done flag. This PR is just to clarify a few things and get a review of the changes made so far.

There are 2 things to clarify when implementing the step() method:

  1. Should we stick to a self-play environment for now?
  2. If so, we should think about self.possible_actions. If an agent plays both sides, red and black, I think we could either keep both sides' possible actions, or reset and recompute the current player's (red or black) possible actions on every call to step(). Let me know if anything is unclear.

UPDATED (Ready for merge): Pretty messy diff logs but please review the new changes for me. Main new features and changes:

Type of change

How has this been tested?

Unit tests, run along with CI.

hojoungjang commented 3 years ago

Hmm, yeah, I think we can focus on agent vs. agent for now. Either way, we will need to swap the turn each time the step() method is called, since player moves will eventually go through this same method too.

Hmm, I think keeping 2 arrays, 1 for agent moves and 1 for enemy moves, will be better than clearing the array each time. Eventually we wouldn't want to clear the entire array either, but only remove the possible actions of the one piece that moved.
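
A rough sketch of the per-piece idea, assuming one side's moves are stored in a dict keyed by piece ID. The `PossibleActions` class, its method names, and the `(start, end)` move layout are made up for illustration and are not taken from the repo.

```python
from collections import defaultdict


class PossibleActions:
    """Hypothetical container: legal moves for one side, grouped by piece."""

    def __init__(self):
        # piece_id -> set of (start, end) squares; this layout is an assumption
        self._by_piece = defaultdict(set)

    def set_piece_actions(self, piece_id, moves):
        """Replace the stored moves for a single piece."""
        self._by_piece[piece_id] = set(moves)

    def remove_piece(self, piece_id):
        """Drop all moves of one piece (e.g. after it moved or was captured)."""
        self._by_piece.pop(piece_id, None)

    def all_actions(self):
        """Flatten to a sorted list of (piece_id, start, end) tuples."""
        return sorted(
            (pid, start, end)
            for pid, moves in self._by_piece.items()
            for start, end in moves
        )


red = PossibleActions()
red.set_piece_actions(1, [((0, 0), (1, 0)), ((0, 0), (2, 0))])
red.set_piece_actions(2, [((0, 1), (2, 2))])

# Piece 1 moves from (0, 0) to (1, 0): refresh only that piece's entries
# instead of clearing the whole collection.
red.remove_piece(1)
red.set_piece_actions(1, [((1, 0), (2, 0))])
actions = red.all_actions()
```

One caveat with this approach: a move can also open or block lines for other pieces, so their entries may need refreshing as well.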

Okay, so I will focus on implementing the self-play environment (agent playing both sides) with 2 separate possible_actions arrays, like this: self.black_possible_actions and self.red_possible_actions.
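
Something along these lines, as a minimal sketch only. The class name `SelfPlayEnvSketch`, the `RED`/`BLACK` encoding, the `possible_actions` property, and the placeholder return values are assumptions for illustration; only `self.red_possible_actions` and `self.black_possible_actions` come from the discussion above.

```python
RED, BLACK = 1, -1  # assumed encoding for the two sides


class SelfPlayEnvSketch:
    """Hypothetical stand-in for the environment; not the repo's class."""

    def __init__(self):
        self.turn = RED  # assuming red moves first
        self.red_possible_actions = []
        self.black_possible_actions = []

    @property
    def possible_actions(self):
        """Actions available to the side whose turn it is."""
        if self.turn == RED:
            return self.red_possible_actions
        return self.black_possible_actions

    def step(self, action):
        if action not in self.possible_actions:
            raise ValueError(f"illegal action for current side: {action!r}")
        # ... apply the move and refresh both action lists here ...
        reward = 0.0   # placeholder value
        done = False   # check/checkmate detection is still to be added
        self.turn = -self.turn  # hand the move to the other side
        return None, reward, done, {}  # observation left as a placeholder


env = SelfPlayEnvSketch()
env.red_possible_actions = ["r1"]
env.black_possible_actions = ["b1"]
env.step("r1")  # red moves, then it is black's turn
```

Because the turn flips inside step(), the same method can later serve human player moves without a separate code path.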