google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Apache License 2.0
4.23k stars 932 forks source link

Solution for Auction Pitch (aka Setback) #452

Closed brianberns closed 3 years ago

brianberns commented 3 years ago

This is more of announcement than an issue, so please feel free to close it. I think the folks who work on this repository might be interested, and couldn't think of a better way of communicating. Apologies if this is considered inappropriate.

I've recently created a Vanilla CFR solution for Setback, using a hand-crafted game abstraction. I'm a .NET developer, so the code is written in F#, rather than using open_spiel, but I think it might still be of interest to this group for two reasons:

The repository is here. Let me know if you have any further interest. Perhaps someone might even want to port the solution to open_spiel. Thanks.

lanctot commented 3 years ago

@brianberns this is really neat! I was not aware of this game, I'd really love to see this game in OpenSpiel too!

Tagging a few people who might be interested: @elkhrt @jhtschultz @inejc @flostim @solinas @ssokota

Some questions and comments:

One neat think about having it in OpenSpiel is that we could try the RL-style generalization via function approximation. E.g. we could run one of the many neural versions of CFR (e.g. NFSP) and see what kind of strategies it generates compared to your abstraction technique.

Very cool, thanks for posting!

brianberns commented 3 years ago

Thank you for the positive response. I'm glad to be considered a member of the community. You all are doing amazing work. My answers below:

Were you aware that this is the approach taken by the first generation of Poker AI agents that played against humans?

By "this approach", do you mean vanilla CFR specifically? If so, yes, I became aware of the poker-oriented history of CFR while I was learning how it works. (Like a lot of people, I naively assumed that poker was largely a game of chance, and was amazed that bluffing can be optimized by a mixed strategy.)

Is the game you solved two-player zero-sum. If so, how is it "bridge-like"?

Like bridge, Setback is typically played by two teams of two players (North/South vs. East/West). And, like bridge, it consists of an "auction" phase followed by a playout phase in which the winner of the auction chooses a trump suit.

Since the members of each team work cooperatively, it can be considered a two-player game for CFR. It can also be considered zero-sum (i.e. any points won by a team are also lost by the other team). If this is not 100% theoretically sound, I can at least attest that it works very well for this particular game, and I strongly suspect that it would work for bridge as well.

Setback is a game that goes back decades in my family. For anyone who's interested in more details, I've attached a paper that my father wrote on the topic some time ago.

Berns on Setback V2.pdf

ssokota commented 3 years ago

Neat! If I understand correctly, you're implementing it as a two-player game with imperfect recall? In general, CFR doesn't have Nash equilibrium convergence guarantees for imperfect recall games so it's interesting to hear that it's so effective for Setback.

brianberns commented 3 years ago

Yes, the abstraction that I'm using "forgets" much of the detailed history (e.g. exactly which cards were played on previous tricks). Intuitively, I was hopeful that this would work, because good human Setback players don't need all of that information either, and it turned out well.

ssokota commented 3 years ago

But even abstraction aside, it's imperfect recall because each team is implemented as a single player, right? Ie, when it's North's turn the North/South player forgets South's hand (and vice versa). Or am I misunderstanding?

brianberns commented 3 years ago

Oh, I see. Yes, that's true. I actually started off with 4-player CFR, but observed that I got the same results faster by simplifying it to two players. (The speed improvement was due to increased CFR pruning, I think, but have not verified.)