google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Apache License 2.0

Kuhn Poker Documentation: #475

Closed xkianteb closed 3 years ago

xkianteb commented 3 years ago

Is there any documentation explaining what each of the 11 indices in the state space of Kuhn Poker represents? I understand the gist of the game, but I need to know what the indices mean in order to compare my final models against the Nash equilibrium strategies.

lanctot commented 3 years ago

Hi @xkianteb, the contents of the tensor are encoded here: https://github.com/deepmind/open_spiel/blob/b8c2ff8e9a4f5dad9b179217f740ddb0df967f7c/open_spiel/games/kuhn_poker.cc#L70

It is certainly still not obvious, though, so we should have an English description in the comments there.

@elkhrt @michalsustr I believe one of you changed this, can you give us the plain English description, along with an example? (I will then add it to the code in that spot.)

xkianteb commented 3 years ago

@lanctot Thank you :)

michalsustr commented 3 years ago

Indeed for Kuhn we can put a detailed comment.

Something along the lines of:

  // WriteTensor encodes the observation for the current state.
  //
  // For example, for a state with history of actions "0 2 0 1 1", i.e. after:
  //
  //  - 0: player 0 was dealt card 0,
  //  - 2: player 1 was dealt card 2,
  //  - 0: player 0 passed
  //  - 1: player 1 bet
  //  - 1: player 0 bet
  //
  // the observations can vary, depending on the IIGObservationType specified 
  // in the constructor:
  //
  // - kDefaultObsType
  //   player: ◉◯
  //   private_card: ◉◯◯
  //   pot_contribution = [2.0, 2.0]
  //
  // - kInfoStateObsType
  //   player: ◉◯
  //   private_card: ◉◯◯
  //   betting: ◉◯
  //            ◯◉
  //            ◯◉
  //
  // - kPublicObsType
  //   pot_contribution = [2.0, 2.0]
  //
  // - kPublicStateObsType
  //   betting: ◉◯
  //            ◯◉
  //            ◯◉
  //
  // For more information, please see comments in the observer.h file.
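
To make the kInfoStateObsType layout above concrete, here is a minimal sketch in plain Python. It does not call OpenSpiel; the function name and constants are hypothetical, and it assumes the flat tensor is simply the three fields concatenated in the order shown (player, private_card, betting), with each betting row stored as [pass, bet]:

```python
# Sketch only: re-creates the kInfoStateObsType layout from the comment above.
# It does not use OpenSpiel; every name here is hypothetical.

NUM_PLAYERS = 2
NUM_CARDS = 3       # cards 0, 1, 2
BETTING_ROWS = 3    # assumed: player 0, player 1, then player 0 again
PASS, BET = 0, 1


def info_state_tensor(history, player):
    """Encode a history such as [0, 2, 0, 1, 1] from `player`'s point of view."""
    deals, bets = history[:NUM_PLAYERS], history[NUM_PLAYERS:]
    tensor = [0.0] * (NUM_PLAYERS + NUM_CARDS + 2 * BETTING_ROWS)

    # "player" field: one-hot over the players.
    tensor[player] = 1.0

    # "private_card" field: one-hot over the cards, only for this player's card.
    if player < len(deals):
        tensor[NUM_PLAYERS + deals[player]] = 1.0

    # "betting" field: one [pass, bet] row per betting action, in order.
    offset = NUM_PLAYERS + NUM_CARDS
    for row, action in enumerate(bets):
        tensor[offset + 2 * row + action] = 1.0
    return tensor


example = info_state_tensor([0, 2, 0, 1, 1], player=0)
print(example)
```

For the history "0 2 0 1 1" and player 0, this gives the same pattern as the kInfoStateObsType example: player ◉◯, private_card ◉◯◯, and betting rows ◉◯, ◯◉, ◯◉.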

Also I made a program called SpielViz which does this nicely (under the Observations card).

xkianteb commented 3 years ago

For Kuhn Poker with two players, what do the last two dimensions of the state space represent?

Player 1's state space:

0: 1 if the current player is player 1
1: 1 if the current player is player 2
2: 1 if player 1 was dealt card 0
3: 1 if player 1 was dealt card 1
4: 1 if player 1 was dealt card 2
5: 1 if player 1 'Pass'
6: 1 if player 1 'Bet'
7: 1 if player 2 'Pass'
8: 1 if player 2 'Bet'
9:
10:

Player 2's state space:

0: 1 if the current player is player 1
1: 1 if the current player is player 2
2: 1 if player 2 was dealt card 0
3: 1 if player 2 was dealt card 1
4: 1 if player 2 was dealt card 2
5: 1 if player 1 'Pass'
6: 1 if player 1 'Bet'
7: 1 if player 2 'Pass'
8: 1 if player 2 'Bet'
9:
10:

michalsustr commented 3 years ago

Indices 9 and 10 indicate whether player 0 passes or bets on their second action. If the sequence is player 0 passes, player 1 bets, then player 0 gets to act again:

image
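
Putting the answer above together with the earlier listing, the 11 indices for the two-player game can be spelled out with a small lookup. This is only a sketch in plain Python: the helper name is hypothetical, players are numbered from 0 as in the OpenSpiel comment, and it assumes the betting rows alternate player 0, player 1, player 0:

```python
# Sketch only: human-readable labels for the 11 indices in two-player Kuhn poker.
# Hypothetical helper; it does not query OpenSpiel.

def index_labels():
    labels = [f"player one-hot: player {p}" for p in range(2)]        # 0-1
    labels += [f"private card one-hot: card {c}" for c in range(3)]   # 2-4
    # Three betting rows of [pass, bet]; the third row is player 0 acting again.
    for row, actor in enumerate((0, 1, 0)):                           # 5-10
        labels.append(f"betting row {row}: player {actor} passes")
        labels.append(f"betting row {row}: player {actor} bets")
    return labels


for i, label in enumerate(index_labels()):
    print(i, label)
```

Under these assumptions, indices 9 and 10 come out as "betting row 2: player 0 passes" and "betting row 2: player 0 bets", matching the explanation above.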

xkianteb commented 3 years ago

@michalsustr I am completely sold on the GUI :) Thanks