Avoid bluffing out cards that could duplicate our own.

flackr commented 1 week ago

Fixes #308. We don't assume that other players will necessarily avoid potential bad touch bluffs.

WillFlame14 commented 1 week ago

I think I'm mixed on this. Bluffs are a good way to touch cards that couldn't otherwise be touched and get a playable at the same time, so in #308 it seems better for muffincake to let robot1 bluff g1 instead of getting it themselves. At the beginning of the game too, I try avoiding cluing 1 if it would touch one that is duplicated in finesse position, since I would like to allow the player to bluff/finesse safely on their turn. So muffincake's 1 clue seems a bit weird and stall/clue-stealy to me, with possibly a good chance robot2 actually has p1.

However, I also think robot2 should probably play in the general case because robot1 cannot perform a Certain Discard, and this ends up being 2 for 2 if a fix is required (better if touching useful cards). Maybe this is too much context to expect the bot to handle, but disallowing bluffs completely just because a chance of duplication exists seems a bit extreme. That does seem to be the purpose of not allowing a Certain Discard, after all. Maybe if there's >= 1/2 chance of duplication then the bot won't give the bluff?

flackr commented 1 week ago

> I think I'm mixed on this. Bluffs are a good way to touch cards that couldn't otherwise be touched and get a playable at the same time, so in #308 it seems better for muffincake to let robot1 bluff g1 instead of getting it themselves.

Right, I'm not saying this game in particular is a good example. I've had other games where it bluffs out a 2 that was saved early on. The primary issue with false bluffs is that there's no way to prevent the play.

> At the beginning of the game too, I try avoiding cluing 1 if it would touch one that is duplicated in finesse position, since I would like to allow the player to bluff/finesse safely on their turn. So muffincake's 1 clue seems a bit weird and stall/clue-stealy to me, with possibly a good chance robot2 actually has p1.

I suspect the thinking was that it saves the 5 at the same time, but it doesn't necessarily happen close together like this. I've seen this happen before with 2 saves from a while ago, though I don't have the replay links handy at the moment.

> However, I also think robot2 should probably play in the general case because robot1 cannot perform a Certain Discard, and this ends up being 2 for 2 if a fix is required (better if touching useful cards). Maybe this is too much context to expect the bot to handle, but disallowing bluffs completely just because a chance of duplication exists seems a bit extreme. That does seem to be the purpose of not allowing a Certain Discard, after all. Maybe if there's >= 1/2 chance of duplication then the bot won't give the bluff?

Right, I agree that if there's a low chance of overlap it's probably good to allow. I was already planning this as a followup. I also think this code needs to be moved anyways since it should still consider that the existence of the bluff prevents some straight finesse interpretations.

flackr commented 1 week ago

I updated the test case to be something more like the real cases I've seen and put in a very naive version of this probability check. I still think this probably needs to move up either to find_clues or interpret_clue, and ideally would account for more complex cases like when you have n cards with m overlapping identities (e.g. three 1's of five possible identities).
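
For illustration, a naive check of this kind could look like the sketch below. It is not the code in this PR: the function names, the representation of each of our cards as a set of possible identities, and the assumption that identities are uniform and independent across cards are all hypothetical. The 1/2 threshold is the one floated earlier in this thread.

```python
from fractions import Fraction


def duplication_chance(bluff_identity, own_cards):
    """Chance that at least one of our cards is `bluff_identity`.

    Illustrative assumptions only: each card is equally likely to be any of
    its remaining possible identities, and cards are treated as independent.
    """
    none_match = Fraction(1)
    for possible in own_cards:
        if bluff_identity in possible:
            # This card avoids the identity with chance 1 - 1/len(possible).
            none_match *= 1 - Fraction(1, len(possible))
    return 1 - none_match


def should_decline_bluff(bluff_identity, own_cards, threshold=Fraction(1, 2)):
    """Decline the bluff once the duplication chance reaches the threshold."""
    return duplication_chance(bluff_identity, own_cards) >= threshold


# Three of our cards are known to be 1s, each with five possible identities.
any_one = {"r1", "y1", "g1", "b1", "p1"}
print(duplication_chance("g1", [any_one] * 3))    # 61/125
print(should_decline_bluff("g1", [any_one] * 3))  # False
```

Under these assumptions, three cards that could each be any of the five 1s give a 61/125 chance of holding g1, just under the threshold, so the bluff would still be given. Treating the cards as independent is the naive part; a fuller version would account for how many copies of each identity remain.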

flackr commented 1 week ago

Actually, I think the ideal way to do this would be extending the information added to the clue of who could give it in #312 to calculate how well each of those players could identify the potential dupe. This would be a more reliable mechanism, and it could replace the avoidable_dupe score I added before. That metric assumes that we only care about clued duplicates in the target's hand, and also assumes that anyone else could give the clue, whereas the new mechanism considers who is in position to give the clue in time.

WillFlame14 commented 1 week ago

> I updated the test case to be something more like the real cases I've seen and put in a very naive version of this probability check. I still think this probably needs to move up either to find_clues or interpret_clue, and ideally would account for more complex cases like when you have n cards with m overlapping identities (e.g. three 1's of five possible identities).

I think what's there now is okay, maybe a second condition like having 3+ cards that could match all with < 5 inferences would also be sufficient to decline, but I don't think it's worth doing too much math since this threshold is arbitrary anyways.
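
As a rough sketch of that second condition (the function name, parameter names, and the set-of-identities representation are made up for illustration and are not taken from the bot's code):

```python
def too_many_possible_dupes(bluff_identity, own_cards, min_cards=3, max_inferences=5):
    """Return True when enough narrowed-down cards could match the bluffed card.

    `own_cards` is a list of sets, each holding the identities one of our
    cards could still be. Both thresholds are arbitrary, as noted above.
    """
    candidates = [
        possible
        for possible in own_cards
        if bluff_identity in possible and len(possible) < max_inferences
    ]
    return len(candidates) >= min_cards


# Three cards narrowed to four identities each, all of which could be g1.
narrowed = {"r1", "y1", "g1", "b1"}
print(too_many_possible_dupes("g1", [narrowed] * 3))  # True
```

A count like this avoids any probability arithmetic, which fits the point that the threshold is arbitrary either way.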

> Actually, I think the ideal way to do this would be extending the information added to the clue of who could give it in #312 to calculate how well each of those players could identify the potential dupe.

Aren't we the only player that could give the bluff due to being in bluff seat? I don't think it matters who else can identify the card.

> I also think this code needs to be moved anyways since it should still consider that the existence of the bluff prevents some straight finesse interpretations.

I'm not sure I understand what this means, why would finesse interpretations be blocked?

flackr commented 1 week ago

> I'm not sure I understand what this means, why would finesse interpretations be blocked?

Bluffs take precedence over layered finesses: https://hanabi.github.io/level-11/#mistaking-a-layered-finesse-for-a-bluff

So I was worried if we don't identify the bluff and have a situation like so:

** ** ** *2
g2 r2 ** **
** r3 ** **
** ** ** **

If find_connections said that a bluff on the g2 isn't allowed, then it might still allow a layered finesse. But of course the early return prevents the layered finesse connection too, so I think it's all good. I added a test just to verify.

flackr commented 1 week ago

> I think what's there now is okay, maybe a second condition like having 3+ cards that could match all with < 5 inferences would also be sufficient to decline, but I don't think it's worth doing too much math since this threshold is arbitrary anyways.

I think this is a definite improvement already. I have an idea based on the extra information added to the clue action in the clue stealing PR but I don't think we need to wait until I have time to try putting something together for this.

WillFlame14 commented 1 week ago

Thanks! :pray:

WillFlame14 / hanabi-bot

Avoid bluffing out cards that could duplicate our own. #309