Closed ZiggerZZ closed 1 year ago
@elkhrt wdyt?
This seems reasonable to me. I wouldn't bother with `absolute_order`; just implement whichever works better.
@elkhrt I would rather implement relative order (`absolute_order=false`), but it would change the current state representation (and potentially the tests). Is that okay with you?
I opened a PR #1118 and updated observation tensors in the tests.
Fixed by #1118 which has now been merged. Closing as resolved, but feel free to re-open if you would like to keep discussing.
Hello,
I would like to propose an enhancement to the bridge state features so that they include more tricks from past play. With 13 tricks, every `state.observation_tensor()` would represent the full history of the card-play phase.

Here's my proposal: we introduce two params, `num_tricks`, which defaults to 2 (as in the current implementation), and `absolute_order` (default `true`), which we can use like this:

`GAME = pyspiel.load_game("bridge(use_double_dummy_result=false, num_tricks=13,absolute_order=false)")`

`absolute_order=true` would indicate that the tricks in the observation tensor are stored in absolute order: 1st trick, 2nd trick, ..., current trick, ..., zeros (for future tricks).

`absolute_order=false` would indicate that the tricks in the observation tensor are stored in relative order: current trick, previous trick, ..., zeros (for future tricks). I find this representation better for RL algorithms.

Here's a working implementation of `num_tricks=13, absolute_order=true` (replace lines 340-360 in bridge.cc):

Here's a working implementation of `num_tricks=13, absolute_order=false` (replace lines 340-360 in bridge.cc):

If you agree with this proposal, I can generalize the above code to any `num_tricks=2..13` and open a PR.
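
To make the two orderings concrete, here is a minimal Python sketch of the proposed layout. It is not the code from bridge.cc: the helper names `encode_trick` and `encode_tricks` are hypothetical, and the per-trick block of 4 players × 52 cards is an assumption made for illustration, so the real tensor layout may differ.

```python
from typing import List, Sequence, Tuple

NUM_PLAYERS = 4
NUM_CARDS = 52
# Assumed size of one trick's block; the real bridge.cc layout may differ.
TRICK_SIZE = NUM_PLAYERS * NUM_CARDS

Play = Tuple[int, int]  # (player index, card index)


def encode_trick(trick: Sequence[Play]) -> List[float]:
    """One-hot encode the (player, card) plays of a single trick."""
    block = [0.0] * TRICK_SIZE
    for player, card in trick:
        block[player * NUM_CARDS + card] = 1.0
    return block


def encode_tricks(tricks: Sequence[Sequence[Play]],
                  num_tricks: int = 13,
                  absolute_order: bool = True) -> List[float]:
    """Encode the most recent `num_tricks` tricks, zero-padding the rest.

    `tricks` lists the tricks so far, oldest first; the last entry is the
    current (possibly unfinished) trick.
    """
    if not 2 <= num_tricks <= 13:
        raise ValueError("num_tricks must be in 2..13")
    recent = list(tricks[-num_tricks:])
    # Absolute order: oldest kept trick first. Relative order: current first.
    ordered = recent if absolute_order else recent[::-1]
    tensor: List[float] = []
    for trick in ordered:
        tensor.extend(encode_trick(trick))
    # Zero blocks stand in for tricks that have not been played yet.
    tensor.extend([0.0] * (TRICK_SIZE * (num_tricks - len(ordered))))
    return tensor


# One finished trick, then a current trick with two cards played.
tricks = [[(0, 5), (1, 6), (2, 7), (3, 8)], [(3, 10), (0, 11)]]
abs_tensor = encode_tricks(tricks, num_tricks=13, absolute_order=True)
rel_tensor = encode_tricks(tricks, num_tricks=13, absolute_order=False)
```

With relative order the current trick always occupies the first block, whereas with absolute order its position shifts as play progresses, which is the difference the proposal points to for RL algorithms.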