Open ThorvaldAagaard opened 8 months ago
I think I have found the problem, but not how to solve it.
We give the neural network its input in bidding rounds, and the output is what to bid.
So for our first bid we give this input:
PAD_START,2C,PASS and get the output 2D.
For our second bid we give this input: PAD_START,2C,PASS PASS,2S,PASS
and get the bid 4H.
When bidding we keep state, so that works. But when matching samples against the actual bidding we don't keep state, so we also match auctions like 2C-Pass-2H-Pass-2S-Pass, where 4H shows 6+ hearts, and those auctions are included in the training data.
So when validating samples we either need state, or we need to include my previous bid when finding the next bid.
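A minimal Python sketch of the input layout described above may make the collision easier to see. It assumes, as this thread describes, that each turn's input holds only the three calls made since the player's previous turn. The function `inputs_for_seat` and the list-of-strings auction format are hypothetical names for illustration, not BEN's actual encoder.

```python
PAD = "PAD_START"

def inputs_for_seat(auction, seat):
    """Build one 3-call input per turn for the player in `seat`.

    `auction` lists the calls starting from the dealer (seat 0).
    Each input is (LHO, partner, RHO) since the player's previous turn;
    the player's own calls are never part of any input (assumption
    taken from the description in this thread).
    """
    rounds = []
    for turn in range(seat, len(auction) + 1, 4):
        window = tuple(auction[i] if i >= 0 else PAD
                       for i in range(turn - 3, turn))
        rounds.append(window)
    return rounds

# Partner deals and opens 2C, so the responder sits in seat 2.
artificial = ["2C", "PASS", "2D", "PASS", "2S", "PASS"]
natural = ["2C", "PASS", "2H", "PASS", "2S", "PASS"]

print(inputs_for_seat(artificial, seat=2))
# [('PAD_START', '2C', 'PASS'), ('PASS', '2S', 'PASS')]
print(inputs_for_seat(artificial, seat=2) == inputs_for_seat(natural, seat=2))
# True
```

Under this assumption the 2D and 2H auctions produce identical inputs, so a stateless lookup cannot tell the splinter from the natural heart raise.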
The fundamental difference between 2C-P-2D-P-2S-P-4H and 2C*-P-2H-P-2S-P-4H is that 2D is artificial and 2H is natural. If that information (alert = artificial) were included, maybe the NN would have an easier time recognizing that a jump bid in a suit never bid naturally before is a splinter bid. But this is pure guessing on my part.
I think it is a design flaw in BEN that it has no state when matching samples. I will try training a new net to see if that fixes it.
Cool. Now I see that the information passed to BEN for bidding and sampling does not include its first bid:
"For our second bid we give this input PAD_START,2C,PASS PASS,2S,PASS": 2D is missing.
So, if I understand correctly, for bidding BEN "knows" its own bid: PAD_START,2C,PASS,2D PASS,2S,PASS
And for matching, both sequences PAD_START,2C,PASS,2D PASS,2S,PASS and PAD_START,2C,PASS,2H PASS,2S,PASS are taken into consideration, as it uses some kind of wildcard:
PAD_START,2C,PASS,wildcard PASS,2S,PASS ?
I'll look into the code and try to understand the learning mechanism.
BEN just sends 3 bids to the NN, so that will work as a wildcard. Start by reading the readmes in /script/training and let me know if you miss something.
On the other hand, TensorFlow is holding state, so perhaps I am just not understanding how TF works :-(
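The wildcard effect discussed above can be sketched as a simple pattern match over candidate auctions. This is only an illustration under the assumption that the player's own calls are ignored when matching; `mask_own_calls` and `matches` are hypothetical helpers and do not correspond to functions in BEN.

```python
WILDCARD = "*"

def mask_own_calls(auction, seat):
    """Replace the calls made by `seat` (0 = dealer) with a wildcard."""
    return [WILDCARD if i % 4 == seat else call
            for i, call in enumerate(auction)]

def matches(pattern, auction):
    """True if every non-wildcard call in `pattern` equals the call in `auction`."""
    return len(pattern) == len(auction) and all(
        p == WILDCARD or p == a for p, a in zip(pattern, auction))

observed = ["2C", "PASS", "2D", "PASS", "2S", "PASS"]   # what was actually bid
candidate = ["2C", "PASS", "2H", "PASS", "2S", "PASS"]  # natural 2H response

stateless = matches(mask_own_calls(observed, seat=2), candidate)
with_own_calls = matches(observed, candidate)
print(stateless, with_own_calls)  # True False
```

Keeping the responder's own calls in the pattern rejects the natural 2H auction, which is the "include my previous bid" option mentioned earlier in the thread.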
BEN failed to bid this slam. 4H is a splinter, agreeing spades.
The double from East might have confused BEN's network, so I will delete that bid for now; that can be fixed later.
Looking behind the scenes, I have discovered what I think is a serious problem.
First I bid this hand
Fine with the splinter (insta_score 0.846). So it looks like the X of 2S is what causes the problem for BEN, and that is not that surprising, as it had not seen that sequence before.
Then I bid this hand
Hmm, exactly the same sequence. This time 4H only had an insta_score of 0.454.
So, looking through all the training deals, I found the following with this bidding sequence:
So the input data is correct: the 4H bidder is short in hearts.
Now if I scan for the bidding 2C-2D-2S-4H, it also turns up inside other sequences, but never showing a 7-card heart suit. So where does the neural network get that from?
And, as expected, I found no training deals where West doubled.
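One way to run this kind of scan is to group the matching deals by the responder's first call and tally the heart length of the 4H bidder. The sketch below assumes deals are available as (auction, hand) pairs with the hand in S.H.D.C form; that record layout, the helper names, and the two toy records are all made up for illustration and are not BEN's training format or data.

```python
from collections import Counter, defaultdict

def heart_length(hand):
    """Hand in 'S.H.D.C' form, e.g. 'K432.7.Q8654.J32' holds one heart."""
    return len(hand.split(".")[1])

def hearts_by_first_response(deals):
    """Tally the 4H bidder's heart length for auctions 2C-P-?-P-2S-P-4H,
    keyed by the responder's first call (the '?')."""
    table = defaultdict(Counter)
    for auction, responder_hand in deals:
        if (len(auction) == 7
                and auction[0] == "2C" and auction[4] == "2S"
                and auction[6] == "4H"
                and auction[1] == auction[3] == auction[5] == "PASS"):
            table[auction[2]][heart_length(responder_hand)] += 1
    return table

# Made-up toy records, NOT real BEN training data.
toy_deals = [
    (["2C", "PASS", "2D", "PASS", "2S", "PASS", "4H"], "K432.7.Q8654.J32"),
    (["2C", "PASS", "2H", "PASS", "2S", "PASS", "4H"], "4.KQJ9872.653.82"),
]
table = hearts_by_first_response(toy_deals)
for first_response, lengths in table.items():
    print(first_response, dict(lengths))
```

If the long-heart hands all sit under a 2H first response, that would support the idea that the network sees both auctions as the same input.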
Looking at the bidding it seems OK, but later when sampling we will find deals with both long and short hearts, resulting in wrong calculations.
So why is the neural network giving a score of 0.454 to a wrong untrained bid?