WillFlame14 / hanabi-bot

A bot that plays on the hanab.live interface.
GNU General Public License v3.0

Doesn't recognize bluff possibility after fake finesse #315

Open flackr opened 6 days ago

flackr commented 6 days ago

Version (PM the bot with /version): v1.4.12

Convention settings: /setall 11

Steps to reproduce or replay link: https://hanab.live/replay/1205109#13

Additional information: robot1 clues the k2 to bluff out b1. robot2 assumes that it is a finesse, presumably thinking that the b1 bluff is not valid?

flackr commented 6 days ago

Ah, the issue arises two turns earlier: on turn 11, when the 2 clue is given to robot2, it considers a finesse on muffincake's b1 as a possibility. It then assumes that the black clue cannot be a bluff because muffincake is already finessed. That assumption about muffincake being finessed is of course incorrect.

flackr commented 6 days ago

I put together a test case for this:

    it(`assumes a bluff when the bluffed player may have been finessed`, () => {
        const game = setup(HGroup, [
            ['xx', 'xx', 'xx', 'xx'],
            ['b3', 'y3', 'y2', 'g2'],
            ['r3', 'b3', 'p5', 'p4'],
            ['b1', 'r4', 'b2', 'y4']
        ], {
            level: { min: 11 },
            play_stacks: [1, 0, 0, 0, 0]
        });
        takeTurn(game, 'Alice clues 2 to Bob');
        takeTurn(game, 'Bob clues 2 to Alice (slot 3)');
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.ALICE][2].order], ['b2', 'r2']);

        takeTurn(game, 'Cathy clues green to Bob');

        // Since we don't know for sure that we have b2, we have to assume Donald may have been bluffed.
        // And we're not promised the g1.
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.DONALD][0].order].finessed, true);
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.DONALD][0].order], ['r2', 'y1', 'b1', 'p1']);
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.ALICE][0].order].finessed, false);
    });

The issue turns out to be much deeper than I thought. Because of the assumed finesse on the b1, we move Donald's finesse position to the r4 in find_finesse. The simpler explanation is certainly that the b1 was not finessed. As such, I think Alice should remove the b1 -> b2 connection and then could figure out a black finesse after playing the 2 and realizing that it's an r2.
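
To make the finesse-position problem concrete, here is a minimal sketch. It is not the bot's `find_finesse`; the function `finessePosition`, the `skipUncertain` option, and the `certain` field are hypothetical names used only for illustration. It models the finesse position as the leftmost card that is neither clued nor already finessed, and shows how an unconfirmed finesse on b1 pushes the position to r4.

```javascript
// Hypothetical, simplified model; the real find_finesse may work differently.
// Returns the index of the leftmost card that can still be finessed or bluffed.
function finessePosition(hand, { skipUncertain }) {
    return hand.findIndex(card => {
        if (card.clued) return false;
        // A confirmed finesse always moves the position. An unconfirmed one
        // moves it only when skipUncertain is set (the behaviour described above).
        if (card.finessed && (card.certain || skipUncertain)) return false;
        return true;
    });
}

// Donald's hand from the test case, with b1 only *possibly* finessed.
const donald = [
    { id: 'b1', clued: false, finessed: true, certain: false },
    { id: 'r4', clued: false, finessed: false, certain: false },
    { id: 'b2', clued: false, finessed: false, certain: false },
    { id: 'y4', clued: false, finessed: false, certain: false },
];

const current = finessePosition(donald, { skipUncertain: true });
const proposed = finessePosition(donald, { skipUncertain: false });
console.log(donald[current].id, donald[proposed].id); // r4 b1
```

With `skipUncertain: false`, b1 stays available as the connecting card for a later bluff or finesse, which is what the green clue needs here.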

flackr commented 5 days ago

> As such, I think Alice should remove the b1 -> b2 connection and then could figure out a black finesse after playing the 2 and realizing that it's an r2.

Of course, everyone would need to understand Alice's confusion so as not to misinterpret the situation if Alice does have a b2 but needs to play the 2 first to know if they're finessed for the k1.

flackr commented 4 days ago

Here are 3 test cases that cover this situation: from Alice's perspective for the two possibilities, and from Bob's perspective for the delayed finesse possibility (the bluff possibility is not tricky from Bob's perspective):

    it(`understands a bluff on an ambiguous false finesse`, () => {
        const game = setup(HGroup, [
            ['xx', 'xx', 'xx', 'xx'],
            ['b3', 'y3', 'y2', 'g2'],
            ['r3', 'b3', 'p5', 'p4'],
            ['b1', 'r4', 'b2', 'y4']
        ], {
            level: { min: 11 },
            play_stacks: [1, 0, 0, 0, 0],
            starting: PLAYER.BOB
        });
        takeTurn(game, 'Bob clues 2 to Alice (slot 3)');
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.ALICE][2].order], ['b2', 'r2']);

        // Cathy could be bluffing out b1 or finessing a g1 in our hand.
        takeTurn(game, 'Cathy clues green to Bob');

        // Donald could be playing into the b1 > b2 finesse or a g2 bluff from Cathy.
        takeTurn(game, 'Donald plays b1', 'r4');

        // Alice can't know whether the g1 > g2 finesse is real until she knows whether Donald was bluffed.
        // She should play the 2 first to find out.
        const action = take_action(game);
        ExAsserts.objHasProperties(action, {type: ACTION.PLAY, target: game.state.hands[PLAYER.ALICE][2].order});
        takeTurn(game, 'Alice plays r2 (slot 3)');

        // Recognizing that Donald played b1 for the bluff, Alice is not finessed.
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.ALICE][0].order].finessed, false);
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.BOB][3].order], ['g2']);
    });

    it(`understands an ambiguous bluff delaying a finesse`, () => {
        const game = setup(HGroup, [
            ['xx', 'xx', 'xx', 'xx'],
            ['b3', 'y3', 'y2', 'g2'],
            ['r3', 'b3', 'p5', 'p4'],
            ['b1', 'r4', 'b2', 'y4']
        ], {
            level: { min: 11 },
            play_stacks: [1, 0, 0, 0, 0],
            starting: PLAYER.BOB
        });
        takeTurn(game, 'Bob clues 2 to Alice (slot 3)');
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.ALICE][2].order], ['b2', 'r2']);

        // Cathy could be bluffing out b1 or finessing a g1 in our hand.
        takeTurn(game, 'Cathy clues green to Bob');

        // Donald could be playing into the b1 > b2 finesse or a g2 bluff from Cathy.
        takeTurn(game, 'Donald plays b1', 'r4');

        // Alice can't know whether the g1 > g2 finesse is real until she knows whether Donald was bluffed.
        // She should play the 2 first to find out.
        const action = take_action(game);
        ExAsserts.objHasProperties(action, {type: ACTION.PLAY, target: game.state.hands[PLAYER.ALICE][2].order});
        takeTurn(game, 'Alice plays b2 (slot 3)');

        // Recognizing that Donald played b1 for the finesse, Alice is finessed.
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.ALICE][0].order].finessed, true);
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.BOB][3].order], ['g1', 'g2']);
    });

    it(`understands delaying on an ambiguous bluff`, () => {
        const game = setup(HGroup, [
            ['g1', 'r5', 'b2', 'g3'],
            ['xx', 'xx', 'xx', 'xx'],
            ['r3', 'b3', 'p5', 'p4'],
            ['b1', 'r4', 'b2', 'y4']
        ], {
            level: { min: 11 },
            play_stacks: [1, 0, 0, 0, 0],
            starting: PLAYER.BOB
        });
        game.state.ourPlayerIndex = 1;
        takeTurn(game, 'Bob clues 2 to Alice');
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.ALICE][2].order], ['b2', 'r2']);

        // Cathy is finessing out the g1 in Alice's hand (we are Bob in this test).
        takeTurn(game, 'Cathy clues green to Bob (slot 4)');
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.ALICE][0].order].finessed, true);
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.BOB][3].order], ['g1', 'g2']);

        takeTurn(game, 'Donald plays b1', 'r4');
        takeTurn(game, 'Alice plays b2', 'b4');

        // Alice is still finessed.
        assert.equal(game.common.thoughts[game.state.hands[PLAYER.ALICE][0].order].finessed, true);
        ExAsserts.cardHasInferences(game.common.thoughts[game.state.hands[PLAYER.BOB][3].order], ['g1', 'g2']);
    });

Edit: I removed the initial 2 clue since it's not necessary and makes the situation even more confusing and important to get right.

flackr commented 4 days ago

@WillFlame14 I'd be really curious to get your thoughts here as to the right way to fix the underlying issues. I think there are at least 4 things at play:

  1. We shouldn't move the finesse position for fake finesses. This is preventing us from finding the same card as a potential connection for a different real finesse (or bluff). It seems finesse cases are taken care of by other logic, so having a special bluff position which ignores finesses may work, but I think in the long run finesses should account for this too.
  2. find_unknown_connecting can't assume "playable already clued" for cards that may not be finessed https://github.com/WillFlame14/hanabi-bot/blob/master/src/conventions/h-group/clue-interpretation/connecting-cards.js#L183
  3. We need some mechanism to delay the uncertain finesse in our own hand until our play which reveals whether it is real.
  4. Others need to recognize this delay in the finesse play as well.
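
Points 3 and 4 can be sketched as a play-ordering rule. This is only an illustration under assumed names: `choosePlay`, the `resolves` field, and the `'maybe'`/`'certain'` values are hypothetical and do not come from the bot's code. The idea is that a playable card whose identity settles an open finesse is played first, and a card that is only possibly finessed is never played blind.

```javascript
// Hypothetical model of points 3 and 4; none of these field names come from the bot.
function choosePlay(hand) {
    // A playable card whose identity settles an open question goes first.
    const revealing = hand.find(c => c.playable && c.resolves !== undefined);
    if (revealing) return revealing;
    // Otherwise only play a finessed card once its finesse is confirmed.
    return hand.find(c => c.finessed === 'certain') ?? null;
}

// Alice's hand after the green clue: slot 3 is the clued 2 (b2 or r2),
// and slot 1 is finessed only if that 2 turns out to be b2.
const alice = [
    { slot: 1, finessed: 'maybe' },
    { slot: 2 },
    { slot: 3, playable: true, resolves: 'slot 1' },
    { slot: 4 },
];

console.log(choosePlay(alice).slot); // 3
```

If the other players evaluate the same rule on the common-knowledge view of Alice's hand, they would expect slot 3 before slot 1 and would not read the delay as evidence that Alice is not finessed.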

WillFlame14 commented 1 day ago

This is complicated. I dislike this kind of confusing possibly-bluff overlapping with a possibly-finessed card, and it only occurs because of the ability to bluff. If everyone was telling the truth, then Alice would always be promised g1 regardless of the rb2 ambiguity.

In regards to 1, if the clue is to us then we can't tell if a finesse is real or not. We already don't write notes for symmetric/fake focus possibilities, so this is the only case where that could happen. I guess once bluffs are allowed we can no longer move the finesse position because of this, but I'm worried this will bring up all sorts of issues once multiple focus possibilities are allowed to be fulfilled using the same card. I don't really see another path forward though, I guess we'd just need to be careful.

2 is currently preventing us from cluing cards that might be finessed, so we would need to modify clue finding to explicitly disallow such clues if we change this.

One way to solve 3 and 4 is to create new waiting connections like [b1 finesse (Donald) -> b2 known (Alice) -> g1 finesse (Alice) -> g2] and [b1 bluff (Donald) -> r2 known (Alice) -> g2] after g2 is clued. This would make rb2 play urgent and remove the need to add extra defer logic, but these connections look kind of unnatural and might be difficult/expensive to construct.
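
As a rough sketch of what those two chains could look like as data, assuming a made-up shape (the `steps` and `target` fields and the helpers `resolve` and `isFinessed` are hypothetical, and the bot's real waiting-connection objects may differ): once Alice's 2 is revealed, the chain that disagrees with it is dropped, and Alice counts as finessed only if every remaining chain finesses her.

```javascript
// Hypothetical shapes; the bot's real waiting-connection objects may differ.
const waitingConnections = [
    {
        target: 'g2',
        steps: [
            { type: 'finesse', player: 'Donald', card: 'b1' },
            { type: 'known', player: 'Alice', card: 'b2' },
            { type: 'finesse', player: 'Alice', card: 'g1' },
        ],
    },
    {
        target: 'g2',
        steps: [
            { type: 'bluff', player: 'Donald', card: 'b1' },
            { type: 'known', player: 'Alice', card: 'r2' },
        ],
    },
];

// Keep only the chains whose "known" step for this player matches the revealed card.
function resolve(connections, player, playedCard) {
    return connections.filter(conn =>
        conn.steps.every(step =>
            step.type !== 'known' || step.player !== player || step.card === playedCard));
}

// A player is certainly finessed only if every remaining chain finesses them.
function isFinessed(connections, player) {
    return connections.length > 0 && connections.every(conn =>
        conn.steps.some(step => step.type === 'finesse' && step.player === player));
}

console.log(isFinessed(waitingConnections, 'Alice'));                         // false
console.log(isFinessed(resolve(waitingConnections, 'Alice', 'r2'), 'Alice')); // false
console.log(isFinessed(resolve(waitingConnections, 'Alice', 'b2'), 'Alice')); // true
```

These results line up with the two Alice-perspective test cases: playing r2 leaves only the bluff chain, and playing b2 leaves only the finesse chain.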

Another way is to only use the existing [b1 finesse (Donald) -> b2] and [b1 bluff (Donald) -> g2] waiting connections and add extra logic for everything you mentioned, which seems like a pretty big undertaking and prone to bugs, since it's not used for anything else.

This is hard to think about 😅, I'm not really sure the best way to do this. I'm leaning toward less overall change though, unless you think the new code would be helpful in other situations too.

flackr commented 8 hours ago

Thanks for your thoughts and pointers to what to watch out for. It sounds like I'm at least on the right track. The good news is that after Alice plays, the replay results in the correct state for Alice.

> This is hard to think about 😅, I'm not really sure the best way to do this. I'm leaning toward less overall change though, unless you think the new code would be helpful in other situations too.

Agreed it's complicated, but I think the closer we can get to everyone having the same understanding of the state, the better and easier it will be to handle in a manner consistent with human play.

For example, even without bluffs, Alice right now would assume that Cathy is busy and thus fail to save a critical card if needed, right?

TLDR, I think it'd be better if these fake finesse possibilities are properly tracked by everyone. I can look into how to do this in a way that's not overly complicated.