lorserker / ben

a game engine for bridge
GNU General Public License v3.0
Sampling problem #91

Open ThorvaldAagaard opened 9 months ago

ThorvaldAagaard commented 9 months ago

I have hit a problem I can't see a good solution to.

BEN opens 1 Spade and the next hand bids 3D (weak).

Now when we sample, we check how well the hands match the bidding by asking the neural network.

With the current neural network, 3D shows a 7-card suit, and depending on how many times we sample, we might find enough boards that match the bidding. If we do not find enough boards, we could increase the number of samples, but that would just give us more boards with a 7-card suit (overfitting).

Now the problem is that BEN is playing against a human (or another robot), for whom a hand like

X XX KQJTXX QJTX

is fine for a preempt (just like in real bridge).

When sampling, BEN will never find this type of hand, because the neural network returns 0.99 for Pass on it.

The only solution I can see is to add boards like this to the training data, but BBA will also Pass, so we would have to add this board to the training data manually.

If it were a single situation it could be handled, but this kind of deviation happens all the time in the bidding (e.g. opening 1N, showing 15-17, on 14 HCP).

The fundamental problem is that we generate a hand and then discard it if it doesn't match the bidding, whereas what we would really like to know is the probability that the hand would make that bid.
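To make the contrast concrete, here is a minimal sketch of the two approaches. It is not BEN's actual sampling code: the function names, the 0.1 threshold, and the 0.90 score are made up for illustration. Only the 0.01 follows from the 0.99 for Pass mentioned above.

```python
def hard_filter(samples, threshold=0.1):
    """Keep a deal only if the network's probability for the observed bid
    reaches the (hypothetical) threshold; everything else is discarded."""
    return [deal for deal, p in samples if p >= threshold]


def soft_weights(samples):
    """Keep every deal, weighted by the probability of the observed bid,
    normalised so the weights sum to 1."""
    total = sum(p for _, p in samples)
    if total == 0:
        return []
    return [(deal, p / total) for deal, p in samples]


# Each entry is (deal description, hypothetical P(3D) from the network).
samples = [
    ("7-card diamond suit", 0.90),   # assumed score, for illustration only
    ("X XX KQJTXX QJTX", 0.01),      # at most 0.01, since Pass gets 0.99
]

print(hard_filter(samples))    # the 6-card hand is dropped entirely
print(soft_weights(samples))   # the 6-card hand is kept with a small weight
```

With weights, the 6-card preempt still counts for very little as long as the network scores it that low, so this only changes how the score is used, not the score itself.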

ThePokerDude commented 6 months ago

So the problem is how the training examples are created. When generating the training examples, X XX KQJTXX QJTX 1) should either always be a preemptive 3D bid, or 2) a mixed strategy could be applied, like 50% Pass, 50% 3D.

I think poker AIs apply this mixed strategy approach.
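A mixed strategy for the training labels could be sketched like this. It is only an illustration: `sample_label` and the 50/50 distribution are hypothetical and not part of BEN or BBA.

```python
import random


def sample_label(bid_distribution, rng):
    """Draw one training label from a distribution over bids,
    e.g. {"PASS": 0.5, "3D": 0.5}."""
    bids = list(bid_distribution)
    weights = [bid_distribution[bid] for bid in bids]
    return rng.choices(bids, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
mixed = {"PASS": 0.5, "3D": 0.5}  # hypothetical mix for X XX KQJTXX QJTX
labels = [sample_label(mixed, rng) for _ in range(10_000)]
print(labels.count("3D") / len(labels))  # close to 0.5
```

Trained on labels like these, a network with a softmax output would be pushed towards roughly 0.5 for each bid on this hand instead of 0.99 for Pass.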

ThorvaldAagaard commented 6 months ago

The training data would normally just include a 3D opening for this hand, and during training that would adjust the network so that BEN would open 3D, so it is very much true/false. Once BEN knows how to bid decently, it can start experimenting :-)

But we have another issue with this: most 3D bids show 7 cards, so when simulating, the sampling might not hit a hand like this one when trying to find samples for a 3D opening.

ThorvaldAagaard commented 6 months ago

Using the new play engine, this is no longer an issue, as all constraints now come with an uncertainty (+- 1 card).
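As a rough illustration of what a +- 1 card tolerance means for the 3D example (this is not the play engine's code; `matches_length` and its parameters are hypothetical):

```python
def matches_length(hand_length, expected_length, tolerance=1):
    """True if the suit length is within `tolerance` cards of the expected length."""
    return abs(hand_length - expected_length) <= tolerance


# Network expects a 7-card diamond suit for 3D.
print(matches_length(6, 7))  # True: the KQJTXX hand now passes the constraint
print(matches_length(5, 7))  # False: two cards short is still rejected
```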

lorserker commented 6 months ago

i suggest accepting/rejecting the sample based on the log-likelihood of the bidding.

for the hidden hand(s) take every bid they made. for each of those bids take the NN score of the bid (the probability of the bid) and take the logarithm of that probability. sum the logs -> that is the log-likelihood.

now decide to accept/reject based on the log-likelihood. the higher the value the more likely to accept.
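A minimal sketch of this suggestion follows. It assumes the NN probabilities for the hidden hands' bids are already collected in a list. The probability floor, the `temperature` parameter, and the rule "accept with probability exp(log-likelihood / temperature)" are assumptions made for this example, since the comment above does not fix a particular acceptance rule.

```python
import math
import random


def bidding_log_likelihood(bid_probs, floor=1e-9):
    """Sum of log-probabilities of the bids made by the hidden hand(s).
    `floor` (an assumption) avoids log(0) for bids the network rules out."""
    return sum(math.log(max(p, floor)) for p in bid_probs)


def accept_sample(bid_probs, rng, temperature=1.0):
    """Accept with probability exp(log_likelihood / temperature).
    The log-likelihood is <= 0, so this value is always in (0, 1]."""
    ll = bidding_log_likelihood(bid_probs)
    return rng.random() < math.exp(ll / temperature)


rng = random.Random(0)
# Hypothetical NN scores for two bids made by one hidden hand.
print(bidding_log_likelihood([0.9, 0.8]))  # log(0.72), about -0.33
print(accept_sample([0.9, 0.8], rng))      # accepted about 72% of the time
```

A higher `temperature` flattens the acceptance probabilities, so unlikely hands such as the 6-card preempt get through more often; at 1.0 a sample is accepted with exactly the product of its bid probabilities.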

ThorvaldAagaard commented 6 months ago

I will look at that. Currently I am using Euclidean distance (viewing the bidding like a graph), and then adding weight to partner's bids.