aikupoker / deeper-stacker

DeeperStacker: DeepHoldem Evil Brother

Multiplayer #2

Closed herrefirh closed 6 years ago

herrefirh commented 6 years ago

Hey @aikupoker ,

Since you seem to maintain this, I thought I'd write. Can you give me an overview of what needs to change to make this multiplayer (3-6 players)?

I have built NNs (though I'm not a pro) and poker apps before, and I've read the DeepStack paper (though my brain is a bit sluggish today). I have learned from experience that it's better to ask someone with more experience before digging too deep and having a week or two or three just disappear.

It seems like I would have to redo terminal_equity.lua to handle one player folding, two players folding, and so on. Hopefully it's seat independent (that is, I don't need separate matrices for WHICH player folded).

Beyond that, I don't have any idea what changes to make to the NN. I usually just waste a lot of time there.

My final concern is that I don't really need much training data about six players on the river. I could be happy with six players preflop and at most three on the flop, turn, and river.

Let me know if you've thought about it or have any guidance. I can fork it if you want to see my changes.

aikupoker commented 6 years ago

Hi @herrefirh

There are a lot of validations in the code that only check for two players; these should be adapted.

About Terminal Equity:

I think that if you are on the river with four players, you will have to calculate a uniform distribution for each player (1/hand plus each chance actions). So, in the end, you will need a better hand to beat four identical hand distributions (tighter play, fewer hands played).
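
A toy sketch of the "tighter range" point, in Python and not taken from this repository: assuming every opponent holds a uniform range over the same hands, and ignoring card removal and ties (both simplifying assumptions), the chance of beating all opponents is the chance of beating one of them raised to the number of opponents. The function name `win_prob_vs_uniform` and the integer strengths are made up for illustration.

```python
def win_prob_vs_uniform(strengths, hero_index, num_opponents):
    """Probability that the hero's hand beats every opponent.

    Assumptions (not from the repository): each opponent's hand is drawn
    uniformly and independently from `strengths`, there is no card removal,
    and a tie counts as a loss.
    """
    hero = strengths[hero_index]
    weaker = sum(1 for s in strengths if s < hero)
    p_beat_one = weaker / len(strengths)
    # Independent opponents: the hero must beat each one of them.
    return p_beat_one ** num_opponents


if __name__ == "__main__":
    strengths = list(range(10))  # ten hypothetical hands, 9 is the best
    for opponents in (1, 3):
        best = win_prob_vs_uniform(strengths, 9, opponents)
        middle = win_prob_vs_uniform(strengths, 5, opponents)
        print(opponents, round(best, 3), round(middle, 3))
```

With ten equally likely hands, the best hand wins 90% of the time heads-up but only about 73% against three opponents, and a middling hand drops much faster.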

About ACPC Protocol:

Source/ACPC/acpc_game.lua is designed for just two players and should be refactored.

The ACPC protocol can handle ring games (just check the last page): http://www.computerpokercompetition.org/downloads/documents/protocols/protocol.pdf
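
For orientation, a minimal Python sketch of parsing a match-state line without hard-coding two players. It assumes the `MATCHSTATE:position:handNumber:betting:cards` layout described in the protocol document, with hole cards separated by `|` and board rounds separated by `/`; the function name and the returned dictionary are hypothetical and are not part of acpc_game.lua.

```python
def parse_match_state(line):
    """Split an ACPC-style match-state line into its fields.

    Assumed layout: MATCHSTATE:<position>:<handNumber>:<betting>:<cards>,
    where <cards> is the hole cards of every seat joined by '|', followed
    by one '/'-prefixed group of board cards per later round.
    """
    prefix, position, hand_number, betting, cards = line.strip().split(":", 4)
    if prefix != "MATCHSTATE":
        raise ValueError("not a match-state line: %r" % line)

    card_groups = cards.split("/")
    hole_cards = card_groups[0].split("|")  # one entry per seat, '' if hidden
    board = card_groups[1:]                 # one entry per dealt board round

    return {
        "position": int(position),
        "hand_number": int(hand_number),
        "betting_rounds": betting.split("/"),
        "hole_cards": hole_cards,
        "board": board,
        "num_players": len(hole_cards),
    }


if __name__ == "__main__":
    # Three seats; only the viewer's (seat 1) hole cards are visible.
    print(parse_match_state("MATCHSTATE:1:0:c:|3dTc|"))
```

The number of players falls out of the number of `|`-separated hole-card slots, so nothing in the parser depends on a heads-up game.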

herrefirh commented 6 years ago

So far today it looks like almost every file needs changing, and some careful thought. I may look for a different approach. I'm not sure I'll have time this week to completely revamp it.

aikupoker commented 6 years ago

Good luck! If you want to share it, that's fine with me! :+1:

herrefirh commented 6 years ago

So, I don't know quite how to accomplish this: (1/hand plus each chance actions)

What I observe so far: the call matrix serves as a mask on the opponent's range of CFVs (I think?).

The call matrix [1326x1326] between two people is created by rotating one matrix of hand strengths 90 degrees counterclockwise and comparing the two. I wonder if we can just have a number of hand matrices equal to the number of players, rotate them in different ways, and then use a max function rather than a greater-than comparison.

So like this:

a a a a     a b c d     d c b a     d d d d
b b b b     a b c d     d c b a     c c c c
c c c c     a b c d     d c b a     b b b b
d d d d     a b c d     d c b a     a a a a
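
One way to read the idea above, as a plain-Python sketch rather than the repository's Torch code: comparing a column of hand strengths against a row of the same strengths gives the pairwise win/lose matrix, and a three-player version could compare the hero's strength with the max of the two opponents' strengths. The names are hypothetical. Card blocking, split pots, and pot-size-dependent payoffs are all left out, so this only illustrates the shape of the comparison.

```python
def sign(x):
    """Return 1, 0 or -1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def call_matrix(strengths):
    """Pairwise showdown result: entry [i][j] is +1 if hand i beats hand j,
    -1 if it loses and 0 on a tie (column of strengths versus row)."""
    return [[sign(a - b) for b in strengths] for a in strengths]


def three_way_tensor(strengths):
    """Hypothetical three-player analogue: entry [i][j][k] is +1 if hand i
    beats the stronger of hands j and k, and -1 if it loses to it.
    A 0 marks a tie with the best opponent, which in a real game would be
    a split pot and needs separate handling."""
    return [
        [[sign(a - max(b, c)) for c in strengths] for b in strengths]
        for a in strengths
    ]


if __name__ == "__main__":
    strengths = [1, 2, 3]  # toy hand strengths, higher is better
    print(call_matrix(strengths))
    print(three_way_tensor(strengths)[2])  # the best hand versus every pair
```

Note that the three-player object is indexed by all three hands, so its size grows as the number of hands raised to the number of players, which is 1326^3 entries for full hold'em ranges.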

Probably this just shows how bad I am at math, but I don't know. Otherwise I feel we'd have to have multiple opponent CFVs, which I think then changes the network as well.

Maybe, for the player (since this is just data generation), I'd end up needing to adjust the network anyway. I'm not sure.

Maybe it is also necessary to have four player ranges but combine the opponent ranges into one. I don't know.

Thoughts?