Description

This change implements the environment's step() method. It is still incomplete: we also need to detect check and checkmate conditions in order to update the done flag. This PR is meant to clarify a few things and get a review of the changes made so far.

There are two things to clarify when implementing the step() method:

Should we stick to a self-playing environment for now?

If so, we should think about self.possible_actions. If one agent plays both sides, red and black, I think we could either keep both sides' possible actions, or reset and recalculate only the current player's (red or black) possible actions on every call to step(). Let me know if anything is unclear.
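To make the two options concrete, here is a minimal sketch rather than the actual gym-xiangqi code. It assumes a flat board of signed integers (positive for red, negative for black, 0 for empty), and compute_actions, CachedBothSides and RecomputeCurrent are hypothetical names used only for illustration.

```python
RED, BLACK = 1, -1


def compute_actions(board, player):
    """Hypothetical stand-in for real move generation.

    Returns one (index, piece) entry per piece owned by `player`.
    Assumes positive ids are red pieces and negative ids are black pieces.
    """
    return [(idx, piece) for idx, piece in enumerate(board) if piece * player > 0]


class CachedBothSides:
    """Option 1: keep both sides' actions and refresh both after every move."""

    def __init__(self, board):
        self.board = list(board)
        self.actions = {p: compute_actions(self.board, p) for p in (RED, BLACK)}

    def apply(self, src, dst):
        self.board[dst] = self.board[src]
        self.board[src] = 0
        # Both lists are rebuilt, because a capture changes the opponent's list too.
        self.actions = {p: compute_actions(self.board, p) for p in (RED, BLACK)}


class RecomputeCurrent:
    """Option 2: hold a single list and recompute it for whoever moves next."""

    def __init__(self, board):
        self.board = list(board)
        self.turn = RED
        self.possible_actions = compute_actions(self.board, self.turn)

    def apply(self, src, dst):
        self.board[dst] = self.board[src]
        self.board[src] = 0
        self.turn = -self.turn  # hand the move to the other side
        self.possible_actions = compute_actions(self.board, self.turn)
```

Option 1 does roughly twice the move generation per step but keeps the opponent's moves available at all times, which may help later when check detection is added. Option 2 is cheaper but only ever knows about the side to move.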

UPDATED (Ready for merge): The diff is pretty messy, but please review the new changes. Main new features and changes:

Implement the step() method in self-playing mode; the terminating condition is currently the death of either general (no check or checkmate detection yet)
Update the random agent and its test script to match the new changes
get_possible_actions() has a new parameter, player, denoting which side is currently playing
Instead of self.possible_actions, XiangQiEnv now has two action spaces, self.agent_actions and self.enemy_actions
Add support for enemy pieces in every piece class's get_actions() method
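The changes listed above could fit together roughly as in the sketch below. This is an illustration under stated assumptions, not the code in this PR: the board is a flat list of signed integers, GENERAL = 1 is a hypothetical piece id, move generation is a toy rule (any piece may move to any square not held by its own side), and the reward is simply the captured piece's id. Only the names step(), get_possible_actions(player), agent_actions, enemy_actions and done come from the description.

```python
RED, BLACK = 1, -1
GENERAL = 1  # hypothetical piece id; the sign encodes the side


class SelfPlayEnvSketch:
    """Toy self-play environment; not the real XiangQiEnv."""

    def __init__(self, board):
        self.board = list(board)
        self.turn = RED
        self.done = False
        self.agent_actions = self.get_possible_actions(RED)
        self.enemy_actions = self.get_possible_actions(BLACK)

    def get_possible_actions(self, player):
        # Toy rule: a piece may move to any square not occupied by its own side.
        actions = []
        for src, piece in enumerate(self.board):
            if piece * player <= 0:
                continue  # empty square or the other side's piece
            for dst, target in enumerate(self.board):
                if target * player <= 0:
                    actions.append((src, dst))
        return actions

    def step(self, action):
        if self.done:
            raise RuntimeError("episode is over; reset the environment")
        legal = self.agent_actions if self.turn == RED else self.enemy_actions
        if action not in legal:
            raise ValueError(f"illegal action for the side to move: {action}")

        src, dst = action
        captured = self.board[dst]
        self.board[dst] = self.board[src]
        self.board[src] = 0

        # Assumed reward: id of the captured piece (0 when nothing is captured).
        reward = abs(captured)
        # Current terminating condition: either general has been captured.
        if abs(captured) == GENERAL:
            self.done = True

        # Refresh both action lists, then hand the move to the other side.
        self.agent_actions = self.get_possible_actions(RED)
        self.enemy_actions = self.get_possible_actions(BLACK)
        self.turn = -self.turn
        return list(self.board), reward, self.done, {}
```

For example, on the board [1, 2, 0, -2, -1], red playing (1, 3) captures a black piece without ending the episode, and black replying (4, 0) captures the red general and sets done to True. Check and checkmate detection would slot in where done is computed.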

Type of change

[ ] Bug fix (non-breaking change that fixes an issue)
[x] New feature (non-breaking change that adds functionality)
[ ] Documentation update (updating documentation)
[ ] Breaking change (fix that breaks existing functionality)

How has this been tested?

Unit tests, along with CI.

tanliyon / gym-xiangqi

Implement step() method #19
