Convention/Framework Proposal: Dawn

pianoblook commented 3 years ago

We've had great success with these changes the past few months, and were recently encouraged to share this with the community. The following is a segment of this doc. So, figured I'd drop it on github for discussion!

Dawn

At the start of the game, cards on slot 1 are no more likely to be playable than on other slots. Therefore, we’ve developed this strategy to efficiently “flush” playables out of players’ opening hands in a way that doesn’t over-prioritize slot 1. We call this extra-early-game strategy “playing at Dawn”. We treat clues given at Dawn very similarly to those given in Duck variants. Specifically, we change some clue interpretations to fit with this chart.

Dawn extends for the first two rounds of any game: Turn 1 through Turn [2 * (number of players)]

For example in a 4p game, Dawn lasts for the first eight turns.
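A minimal sketch of the Dawn window described above, assuming turns are counted from 1 and that a "round" is one turn per player. The function names `dawn_last_turn` and `is_dawn` are hypothetical, made up for illustration only.

```python
def dawn_last_turn(num_players: int) -> int:
    """Last turn of Dawn: two full rounds, i.e. 2 * number of players."""
    if num_players < 2:
        raise ValueError("Hanabi needs at least 2 players")
    return 2 * num_players


def is_dawn(turn: int, num_players: int) -> bool:
    """True if the 1-indexed turn falls within the first two rounds."""
    return 1 <= turn <= dawn_last_turn(num_players)
```

With this sketch, `is_dawn(8, 4)` is true and `is_dawn(9, 4)` is false, matching the 4p example of eight turns.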

The general rule is that clues that would require more than one blind-play from a single player are instead asking for something special. This really only impacts clues given to certain 3s and 4s:

3 Bluffs become 3 Discharges

UNLESS the 3 is actually just one away: then it’s a normal Bluff
4 Charms remain ON for two-away 4s

Unlike normal 4 Double Bluffs, Bob should ONLY play normal Finesse position if Cathy’s slot 1 is currently playable (instead of assuming he has exactly the matching card to make a 1-away card become playable)
Self 3 Bluffs still work as normal, and should be marked as any 3
Note that 3 Bluff and 4 Charm take precedence over UTD/UTC at Dawn
As long as it’s very clear what’s happening, certain Suboptimal clues can be used to promise a true double finesse.

For example, say Cathy holds three different 3s. On turn 1, Alice clues only ONE of the threes as Red, instead of simply rank cluing them all. Since Bob has no known 3s in his own hand that might be blocking good touch, Bob trusts he’s being Finessed for r1 + r2.
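The rules for 3s and 4s above can be sketched as a small lookup. This is only a rough reading of the list, not an official rule table: it covers clues from Alice to a later player (not Self 3 Bluffs), the function name `dawn_reading` is hypothetical, and `away` is assumed to mean how many cards are still missing before the clued card becomes playable. Cases the text does not spell out return `None`.

```python
from typing import Optional


def dawn_reading(rank: int, away: int) -> Optional[str]:
    """Reading, during Dawn, of a clue to a not-yet-playable 3 or 4.

    rank: rank of the clued card (only 3 and 4 are covered here).
    away: 1 = one-away from playable, 2 = two-away, and so on.
    Returns None for cases the proposal text does not describe.
    """
    if rank == 3:
        if away == 1:
            # "UNLESS the 3 is actually just one away: then it's a normal Bluff"
            return "3 Bluff"
        if away >= 2:
            # "3 Bluffs become 3 Discharges"
            return "3 Discharge"
    if rank == 4 and away == 2:
        # "4 Charms remain ON for two-away 4s"
        return "4 Charm"
    return None
```

For instance, on turn 1 with nothing played, a 3 is two-away, so `dawn_reading(3, 2)` gives `"3 Discharge"`.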

pianoblook commented 3 years ago

One nice side benefit we've adopted because of this is more clarity on 5-Pulls, sort of like what was discussed in #472. The general consensus in that thread was that although dividing 5NDs into different responses made more sense logically, it was too much of a strategic loss to disallow more Discharge opportunities at the very start of the game... Well, this solves that!

(EDIT: for a more in-depth description of our 'Precision 5' framework, check here)

Jayhui-q commented 3 years ago

This... is... fun! I really like these <3

Any particular reason for 2 rounds versus 1 or 3 rounds? Maybe that was play-tested and 2 felt the best?

Either way, this makes the early game so complicated, BUT I love it. Very cool stuff.

pianoblook commented 3 years ago

I wouldn't say complicated, but more just...super deep in terms of options and potential lines you can take. FWIW 3 discharges are no more complicated than 3 bluffs, imo.

As for 2 rounds, it feels like the best happy medium. To be fair we've only done extensive play-testing in 3player, though.

But I think the same reasoning holds true in 4p+:

1 round just isn't enough to give everyone enough opportunity to discharge/charm their Bob
3 rounds seems excessive though - I mean that's 12 turns of the game in 4p 😵 By then I'd argue it's become much more likely that Bob has something newly drawn than it is still in his discharge slot

piper0 commented 3 years ago

I'm gonna go ahead and strongly disagree that this makes the early game more complicated. 3 discharges, 4 charms, and trash blasts are all genuinely just as intuitive and easy to use as 3 bluffs, 4 charms (lol), and trash discharges.

The precision 5 tech is admittedly a bit more complicated, but that's not really the meat of this proposal.

When you have a ton of options for getting cards from honestly pretty much any slot, it definitely means choosing an opening line can be a bit trickier, but that's just because you have significantly more choices than you do without Dawn. The lines themselves are just as easy to execute, though – it just means that Alice has the luxury of considering many more lines than have historically been available to her. It's not more complicated. It's just... more.

Jayhui-q commented 3 years ago

Apologies, by complicated I meant to imply what Piano means by "deep." I don't mean that interpretability goes down. I don't think it's complicated in that sense.

I meant to use the word to suggest the complexity and intricateness of line construction. There is this interplay of what lines one can construct so as to both utilize dawn tech and convert that into tech at noontime. For example, discharging out slot 3 and then 5ce'ing slot 2 rather than finessing and then discharging (because you might no longer be able to discharge at round 3, for example).

So yes... more.

pianoblook commented 3 years ago

I realized I didn't actually put in any specific in-game examples here, so here's a dump from the past few weeks:

3D https://hanab.live/replay/468003#5 https://hanab.live/replay/466827#4 https://hanab.live/replay/463221#1 https://hanab.live/replay/465201#2 https://hanab.live/replay/465293#3 https://hanab.live/replay/463177#1 https://hanab.live/replay/450520#3 https://hanab.live/replay/450659#1 https://hanab.live/replay/450497#1

4C (over double bluff) https://hanab.live/replay/450626#2 https://hanab.live/replay/450474#2

And for fun, here's a taste of a 4Charm Mirage https://hanab.live/replay/465236#1

sjdrodge commented 3 years ago

I've been playing games recently w/ Dawn on, and my opinion is that the changes are magnificent for variants where it's difficult to clue cards (Null being the quintessential example), and it's pretty much a wash for No Variant. So I wanna give this proposal a huge thumbs up. It's very simple and powerful and I've been loving it so far.

This is slightly out of scope for this proposal, but KiPiPi have a few additional Null-specific conventions coming down the pipeline, and I firmly believe once Dawn + those conventions are accepted, we can get rid of positional clues altogether, which in my mind is an enormous win.

dobi815 commented 3 years ago

There is a lot to parse in this proposal - I think it might be better as a bunch of smaller proposals to evaluate better. I am mostly interested in the 3 bluffs -> 3 discharges piece and would love to have it be a part of the game in low score phase.

Dawn extends for the first two rounds of any game: Turn 1 through Turn [2 * (player #)] For example in a 4p game, Dawn lasts for the first eight turns.

What about extending this to low score phase or early game? Is that too long? The introduction of a new phase, set exactly at first 2 rounds, seems a bit too short. Considering that dawn clues are not delayed, this really just limits to playing #-players cards with this convention.

3 Bluffs become 3 Discharges
UNLESS the 3 is actually just one away: then it’s a normal Bluff
4 Charms remain ON for two-away 4s
Self 3 Bluffs still work as normal, and should be marked as any 3

Why not anything two-away vs. 3s and treating two away 4s different? Any examples from trying this in U&D?

piper0 commented 3 years ago

What about extending this to low score phase or early game? Is that too long?

We've play-tested a ton of different versions of this over the last few months, and two rounds seems to be a pretty perfect length. It's usually enough time to get all the discharge/charm cards played, and after touching a bunch of cards to do this, you'll frequently wind up with first finesse on slot 2 or 3 anyway. More importantly, 2 rounds is enough time to draw a lot of cards, at which point Dawn becomes a lot less exciting. Remember that the primary purpose of Dawn is to not overemphasize first finesse at a time when first finesse isn't inherently interesting. Once folks start drawing, slot 1 becomes much more appealing.

That said, I definitely encourage you to science some different lengths of Dawn!

waweiwoowu commented 3 years ago

Nice framework. And I'm happy to see people playing with 3 Discharges and 4 Charms (which only requires two blind-plays) outside of the variants without positive information.

rz-1 commented 3 years ago

Under Dawn, would a clue focusing the 3 of a suit but also touching the 2 be a 3 discharge or 1 away 3 bluff? i.e.:
Turn 1, Alice clues red to Cathy
B: x x x x x
C: [r]3 [r]2 x x x
Does Bob play s1 or discharge?

pianoblook commented 3 years ago

Under Dawn, would a clue focusing the 3 of a suit but also touching the 2 be a 3 discharge or 1 away 3 bluff?

We've been playing it as a Discharge - there are a couple ingame examples of that in the replays I posted above too

rz-1 commented 3 years ago

We've been playing it as a Discharge - there are a couple ingame examples of that in the replays I posted above too

Doesn't this conflict with UDD? How does Cathy know the focus of the clue is a 3 or trash?

pianoblook commented 3 years ago

yes, 3D takes precedence at dawn over any UTD at dawn.

in 3p this is solved with a Blast, but that's not really part of the main proposal

Zamiell commented 3 years ago

remembering different sets of rules for different situations is annoying and complicated. (dawn --> non dawn, early game --> mid game, low score phase --> mid score phase) thus, i dont think this is a good fit as a general convention. why not just have it as a null specific convention as a replacement for null positional clues, or something along those lines?

sjdrodge commented 3 years ago

remembering different sets of rules for different situations is annoying and complicated. (dawn, non dawn, early game, mid game, low score phase, mid score phase) thus, i dont think this is a good fit as a general convention. why not just have it as a null specific convention as a replacement for null positional clues, or something along those lines?

I fully support this. Though I do think Dawn is nice for a few other variants where cluing specific cards is hard (like Omni, say). But it definitely makes most sense as a variant-specific solution, and we really really don't need it in NV games.

pianoblook commented 3 years ago

Sure, whatever seems best for general conventions is fine.

For the record, from the several months' experience of using it now I'm quite convinced it's an improvement in all variants and, seemingly, all player counts. But there's no denying that it's an increase in overall game complexity, so RIP

jakestiles commented 3 years ago

Agree with Zam and Stephen’s posts.

sephirothx commented 3 years ago

https://github.com/hanabi/hanabi.github.io/blob/main/misc/hat-guessing.md

pianoblook commented 3 years ago

https://github.com/hanabi/hanabi.github.io/blob/main/misc/hat-guessing.md

None of this is any more 'hat guessing' than any other convention that gets a different response based on the 2+ away criteria.

The crux of this change comes down to, "turn 3 bluffs into Discharges in the first two rounds of the game". That has nothing to do with hat guessing. Maybe you didn't really read the proposal?

Zamiell commented 3 years ago

i would like to play 3-5 null games with this before commenting on whether or not it is good and/or accepting it

in the meantime can you submit a PR, or paste the relevant new sections here in markdown format

pianoblook commented 3 years ago

To be very clear, this proposal is totally removed from Null - we play this way in all games, no variant included.

I'm realizing that maybe it was confusing publishing this in the same doc as non-positional Null. Although we think Dawn helps for playing Null, Dawn & non-positional Null are two independent proposals

Zamiell commented 3 years ago

right, but so far, it sounds like the general case of dawn isn't wanted.

to keep things more clear, lets open a new issue for dawn in null specifically, if that's something you want. include in the issue the relevant sections that will be copy pasted into the null page, as well as any sections that will be deleted, if any

pianoblook commented 3 years ago

riperonis

sjdrodge commented 3 years ago

Can we reopen this issue? A lot of us have been playing with Dawn on all the time, and as a result my opinion has changed. I now suspect that Dawn is better even in NV.

Let me explain more fully. When it comes to maximizing win-rate in low efficiency variants, there are two major areas where I think our current conventions are lacking:

discard quality
a lack of legal stall clues in the late mid-game

Dawn does a tremendous job of addressing the first issue.

While it's true that 4's are often the best non-trash discard, on a large percentage of deals it is possible (with the right conventions) to discard exclusively trash. In order to accomplish this, the team will often have to touch all the non-trash cards in the starting hands. Our conventions are very good at touching most of the 3's, but currently we have trouble touching the 4's reliably. Our main ways are just via ancillary touches from other clues and via Double Bluffs. Adding more ways to touch 4's at the very beginning of the game is a great extra tool to have in the toolbelt.

It's also much easier to touch all the useful cards in the starting hands when you can reliably give high efficiency clues. At the very beginning of the game, the usual bias towards playables being in slot 1 doesn't apply. Therefore, it makes a lot of sense to have fewer ways to get cards from slot 1 and more ways to get them from slot 3 and 4. The result is more reliable X-for-1's.

pianoblook commented 3 years ago

It seems like there's a strong consensus among those who have tried it more than a couple times (except for Kakashi, lol) that turning on Dawn is just better. To add on to all the great points Stephen made, there are two other big advantages I'd like to mention:

1) More chances to efficiently spread out available play signals, thus preventing early extinctions of available clues.

Normally, if Bob has ≥two 1's in his hand, Alice will often feel obligated to give the clean x-for-1 1's clue (unless they're lucky and one of them is on slot 1). This is fine at first, but it accelerates the end of Early Game: it extinguishes two or more potential play signals with one clue.
Even without Dawn, I'm fairly convinced that declining to give a 2-for-1 on 1's, and instead breaking it into two 1-for-1s, is probably optimal a decent amount of the time - even if it's really just your chop you might be meaningfully protecting. Adopting Dawn really empowers this sort of segmentation by giving efficient opportunities for play signals, without rushing to end early game.

2) Much more flexibility in selecting opening lines, allowing better teamwork and helping protect useful cards down the line.

Stephen touched on some of this already: more opportunities to touch 3s and 4s at the start is already pretty huge.
Going further, Dawn also leads to more options overall, which leads to more valuable decision-making, and in turn better outcomes. It's very common in Dawn openers to be presented with a lot of possibilities (people have even mentioned it can feel overwhelming). Very often I find myself needing to compare multiple lines several rounds into the future to determine which is best. Obviously that happens plenty of times outside of Dawn too, but it's much more common now. Actually, the opening hands of the last game I played are too good not to share for this point: https://hanab.live/replay/607724#1. I just remember being excited as Cathy on t1 to know that even with such an awkward deal to piper, kimbi would surely be swimming in options and probably deciding between a potential 5ce, 3d, or 4c.

pianoblook commented 3 years ago

oh there are two more big benefits that haven't been mentioned yet:

1) There's a lot less ambiguity about the identity of the focused card.

If a color clue gets a slot 1 play, then it's no longer "23": it's always just a 2.
If a 3 clue to Cathy+ gets a slot 1 play, then usually it's exactly a 1-away 3 (usually immediately identifying it).
Alternatively, this can set up easy true finesses if no ungotten 3s are 1-away. Essentially 'hard 3 bluffs' get turned off.

2) As a nice side-benefit of having a reliable early-game Discharge tool, 5 Tech can be restructured to be much more logically consistent + informative:

A 5 Pull on a 2+away card = Ejection (Play --> Save), thus instantly Chop Moving the card.
A 5 Pull on a trash card = Discharge (Play --> Trash), thus marking it as kt and further improving discard quality.

pianoblook commented 3 years ago

I won't even get into how fun and powerful Dawn Shadows+Mirages are for hard variants (I would propose they shouldn't apply in normal variants, since their existence does slightly restrict Alice's options on delicate board states)

sjdrodge commented 2 years ago

@Zamiell This has been sitting around for a while, and it seems we aren't going to get any more unsolicited opinions. If you're still uncomfortable w/ accepting it, maybe give it a try in a few games or directly poll players whose opinion you wish to hear.

Zamiell commented 2 years ago

it sounds like dawn is "good". but the primary issue with dawn doesn't have much to do with whether or not it is "good", or that playing with it will lead to better scores. the primary issue is that it is annoying for players to remember three different "phases" at once:

1) early game --> mid game
2) low score phase --> mid score phase
3) dawn --> non dawn

the idea of the hyphenated conventions is that we get as much value as we can without going over some arbitrary complexity threshold, and this convention sounds like it might push us over the arbitrary threshold.

we want to make sure that even at the highest level, its not super annoying to remember all of the things for players who are not dedicating their entire lives to hanabi. so sometimes that means rejecting "good" conventions

Zamiell commented 2 years ago

/stale

conventions-bot[bot] commented 2 years ago

Some time has passed since this issue was opened and the discussion appears to have died down.
💤 Either the document has already been updated or no additional changes need to be made.
This issue will now be closed. If you feel this was an error, feel free to continue the discussion and a moderator will re-open the issue.

(For more information on how consensus is determined, please read the Convention Changes document.)

aliblong commented 2 years ago

I've played with dawn, and I can echo the sentiments about it being powerful and not complicated. Count my vote in support, bringing the total to 6 yea and 1 nay.

Stephen's last suggestion to you, @Zamiell , was "to give it a try in a few games or directly poll players whose opinion you wish to hear." Given the support tally, I'll reiterate this point -- the ball is in your court to seek more support for your own position, or at the very least make it clear why you think such a lopsided vote count isn't enough to ratify a new convention, and also to clarify what it would take to get this across the finish line.

One last comment: I support this being an opt-in convention, if that were to be a section in the doc.

mmelwen commented 2 years ago

I am also in favor of dawn. In the first two rounds you usually don't find trash for discharges, and I also love 3's, so it is a good way to combine keeping 3's and playing third finesse position.

aliblong commented 2 years ago

Oh, and as others have already done, I oppose this assessment:

this convention sounds like it might push us over the arbitrary threshold

It's really just not complicated at all; the Dawn phase is strictly defined, and gameplay during that time actually uses relatively fewer conventions than normal, due to 3 discharges and 4 charms superseding all the normal ways of doing discharges and charms.

And regarding this:

we want to make sure that even at the highest level, its not super annoying to remember all of the things for players who are not dedicating their entire lives to hanabi. so sometimes that means rejecting "good" conventions

With "good" in quotation marks, I'm interpreting your argument as simply being that you think this convention has a low benefit-to-complexity ratio. While I've already argued that the complexity is not high and that the benefit is high, I also want to stress that "super annoying to remember", i.e. your idea of a "headspace convention", doesn't accurately describe something as foundational as Dawn. It would either be something the group explicitly opts into, meaning it's top of mind, or something that is used in every single game, which means you'll acclimate to it after just a few games.

Jayhui-q commented 2 years ago

Chiming in here again with my support on this. This part on its own is really convincing to me:

At the start of the game, cards on slot 1 are no more likely to be playable than on other slots.

I also don’t think this is more complicated than learning 4 charms in a general game.

Some things are odd to me personally, though, such as:

Self 3 Bluffs still work as normal, and should be marked as any 3

And I haven’t gotten enough play-testing in to verify whether 1 round of play is worse than 2 rounds of play for this. My gut tells me that Dawn is much better on the first round, but its benefits drop substantially by the second, especially for games with more than 3 players.

Nevertheless, I think it should be a smooth addition to make it an “optional” or “opt-in” convention. In my opinion, this would really facilitate more play testing and we can get a better sense of where it belongs and how to optimize its features.

pianoblook commented 2 years ago

Just want to respond to @Jayhui-q 's two comments - I totally get why those two decisions may feel strange on paper, but here's what we've found so far, at least:

Self 3-Bluffs: This is mostly a flexibility thing, plus I can't envision a stronger alternative. Just as in non-Dawn, it can be pretty easy to construct a line where the team tees up a 3 in front of a 1 in your Bob's hand. Self 3-Discharges are also pretty low value when you could instead 4Charm (or just chop-focus). We've found this to be a nice compromise, though there is nothing necessary about their inclusion in the system.
2 Rounds: This one I feel quite strongly about, after playing 500+ games with it. The simplest way to put it is that it helps ensure that each teammate gets a chance to give a Dawn clue to their Bob. e.g. if Alice starts with a 3D to Bob, but then Dawn is over by the time it's Alice's 2nd turn, Bob will never have had a chance to pull off his own sweet Dawn clue to Cathy. In my experience Dawn in round 2 is just as powerful as in round 1, since in the 1st round your teammates can also help craft lines to set each other up for follow-up 3Ds.

Jayhui-q commented 2 years ago

@pianoblook thanks for the quick response!

Self 3-Discharges are also pretty low value when you could instead 4Charm (or just chop-focus).

Great point. I guess there would be moments in other variants where inability to chop-focus and/or being able to blast when 2 cards are touched are more relevant. Anyway, now that I think about it more, you’ve convinced me about the benefits of having this generally.

2 Rounds: This one I feel quite strongly about, after playing 500+ games with it.

Could you quickly clarify if this conclusion also applies to 4 or more players? I am convinced personally by 2 rounds in 3 players, and just haven’t had the games in with 4 or more to feel it out. If a decent chunk of those 500+ tested games are 4 or more, I’m sold on the 2 round rule.

Romain672 commented 2 years ago

Another question is what you want to do with it if it's added:

added immediately (level 12)
added at the end (level 23)
added in extras
added as optional in the extras, as an option which can be turned on/off at the start of every game

Romain672 commented 2 years ago

I personally feel like this convention starts to enter the territory where you do different things depending on your number of cards not chop moved, but it's done artificially with that '2 rounds' thing.

pianoblook commented 2 years ago

I personally feel like this convention starts to enter the territory where you do different things depending on your number of cards not chop moved, but it's done artificially with that '2 rounds' thing.

If it helps it feel less artificial, just think of it this way: Dawn clues are a special way to help get cards out of people's opening hands. There's nothing artificial about the fact that cards in players' opening hands are all equally likely to be playable at the start of the game. And uncoincidentally, all Dawn clues that would be given in the first 2 rounds will always only get cards from opening hands.

Could you quickly clarify if this conclusion also applies to 4 or more players?

I wish I had some way to quantify the number of 4p+ Dawn games I've done; maybe 30-70? Anyway, I strongly urge folks to start with 2 rounds. It really is a big loss to not let everyone have a decent chance to give a Dawn clue. If there was only one round of it then suddenly I as Alice would start needing to weigh whether it's worth me 3D'ing my Bob if it will instantly & permanently erase his ability to follow-up with a 3D/4C of his own to Cathy.

One round would certainly be better than none, but two just feels super good. I don't really see what the player count would have to do with this, actually

Jayhui-q commented 2 years ago

@pianoblook thanks for the clarity!

Romain672 commented 2 years ago

Another question: if you merged early game with the first 2 turns, that would mean that after 2 full turns, 5 Stalls are off and everyone has the right to discard.

I know it's not great, but that's another option.

(with that idea, maybe 2 full turns + 1 turn is best, to get three chances for players to save a 2/5 in the first player's hand)

Zamiell commented 2 years ago

since libster seems to think that the complexity is low, i will reopen this issue and accept a pr for it.

i will say though that there seems to be a selection bias in GitHub-style voting.

for example, if you wanted to find out the answer to the question of "What percentage of Americans would say that pizza is their favorite food?", and then you proceeded to poll people at a local pizza restaurant, then you would probably get a different result than if you polled people in a more neutral setting. by choosing a pizza restaurant, you would be selecting for people who probably already like pizza, and this is going to skew the results that you get.

in the same way, only the most hardcore Hanabi enthusiasts have the time or inclination to read and discuss Hanabi conventions on GitHub. so this style of voting is selecting for the types of people who are going to think that "dawn is not that complicated", and push it forward. the overwhelming majority of people who join the discord server and play pickup-games with the Hyphen-ated conventions are not as good at the game or as intelligent as Libster. so when I am thinking about the complexity cost of a particular convention, i'm not particularly weighing Libster's analysis very highly, especially given that he doesn't participate much (or at all?) in non-competition games.

i think it's important and healthy for everyone voting on conventions to play pick-up games in the Discord server with a wide range of people from all skill levels, and not play in your own insular-competition groups. doing this kind of thing helps correct the bias when it comes time for thinking about what types of conventions we should accept and what kinds of conventions we should reject.

pianoblook commented 2 years ago

I do feel obligated to mention that you must realize that there's some irony in you saying all that - there have been many convention changes that have been made despite there not being proven broad support (and the opposite as well).

But that being said - and despite Dawn's proven benefits - I hear what you're saying, and actually agree. I'll copy from a previous discussion I had about it fairly recently:

When it comes to Dawn, tbh I think it would be a weird change to make without a lot more community support/awareness of it. It's a pretty huge shift gameplay-wise, and frankly I wouldn't even want to pass it without more broad community input.

So perhaps a few options: 1) Just turn it on and let people learn it. I wasn't around when 3 bluffs got turned on, but I would imagine this change would be similar in cognitive impact in the short-term. 2) Somehow do a wider polling/discussion on the main Discord. If there's reasonable blowback then it doesn't sound very fun to turn it on officially. 3) Potentially make it some sort of opt-in feature? I'm imagining an optional 🌅 emoji or something in table select, I dunno.

Zamiell commented 2 years ago

there have been many convention changes that have been made despite there not being proven broad support

the point of the analogy was not to say that "conventions should have broad support". rather, it was intended to show that "expert players who dedicate their lives to hanabi are going to have different opinions on how much complexity is manageable than other people", which of course is related, but I think quite distinct.

Just turn it on and let people learn it. I wasn't around when 3 bluffs got turned on, but I would imagine this change would be similar in cognitive impact in the short-term. Somehow do a wider polling/discussion on the main Discord. If there's reasonable blowback then it doesn't sound very fun to turn it on officially. Potentially make it some sort of opt-in feature? I'm imagining an optional 🌅 emoji or something in table select, I dunno.

absolutely not to 2 or 3, but whoever does a pr should put it at level 24

waweiwoowu commented 2 years ago

Agree with what Zamiell has mentioned, and I understand why some of my proposals (and piano's) got rejected. But for dawn, I think it is too good not to be turned on, despite the fact that it is a little bit complicated. I bombed twice in the games with dawn on, forgetting that it only applies in the first two rounds. But I think that's just part of the process of learning new conventions.

I'm happy this issue got reopened again and I'm looking forward to seeing it becoming officially approved.

Dr-Kakashi commented 2 years ago

The way I'm analyzing the games: I'm counting the number of clues, cards played, and cards remaining touched in players' hands. This allows us to calculate efficiency. Then I look to see which lines have a better setup to continue the game.

I used the games piano posted, as they contain dawn moves and should be the best examples of dawn.

Game 1
2022-01-06_14-17-48 (dawn1a)
Above image is Dawn Line of a random game that we found dawn moves in.

2022-01-06_14-17-48 (dawn1b)
Above image is Hyphenated Line

Dawn
T1 - Ace gives red to Kakashi, 3 for 1 bluff, R2 is now immediately known
T3 - Kakashi gives 3 to micerang, 2 for 1 discharge, 3 is now known to not be black
T5 - Mice gives ace yellow, Normal 3 for 1 play clue on Y1
T6 - Kakashi gives red to mice as a 1 for 1
4 Clues, 2 cards played, 7 cards gotten, 2.25 efficiency
BDR = 0

Hyphenated
T1 - Ace gives red to Kakashi, Kakashi doesn't know if it's R2 or R3
T3 - Kakashi gives Ace Yellow as a normal 3 for 1 play clue on Y1
T5 - Mice gives ace black as a 5CE
3 clues, 2 cards played, 6 cards gotten, better setup to continue the game, 2.67 efficiency
BDR = 0

Hyphenated spent 1 clue less and is able to continue the game in a better position.

Dawn 0 - Hyphenated 1
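The efficiency figures quoted for Game 1 are consistent with dividing (cards played + touched cards still in hand) by the number of clues spent. The formula is inferred from the numbers rather than stated outright, so treat the sketch below as an assumption; the function name `efficiency` is hypothetical.

```python
def efficiency(clues: int, cards_played: int, cards_touched: int) -> float:
    """(cards played + touched cards still in hand) / clues spent."""
    if clues <= 0:
        raise ValueError("need at least one clue")
    return (cards_played + cards_touched) / clues


# Game 1 figures as reported above.
dawn = efficiency(clues=4, cards_played=2, cards_touched=7)
hyphenated = efficiency(clues=3, cards_played=2, cards_touched=6)
print(f"Dawn {dawn:.2f}, Hyphenated {hyphenated:.2f}")
# prints "Dawn 2.25, Hyphenated 2.67"
```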

Game 2
2022-01-06_14-45-41 (dawn2)
Above image is Dawn Line

2022-01-06_14-45-41 (dawn2B)
Above image is Hyphenated line 2

Dawn - 4 clues, 4 cards played, 5 cards touched, 2.25 efficiency. 1 Trash card is picked up. Only g2 on finesse position to continue. BDR = 0
~~Hyphenated - 4 clues 4 cards played 4 cards touched, 2.00 efficiency. setup for 3 for 1 to continue~~
Hyphenated line 2 - 5 clues, 5 cards played, 6 cards touched, 2.2 efficiency. 1 trash is picked up, which becomes known y2 when stephen discards. Piper has y3 for continuation. Kimbi needs a fix on o2 or will play it as o1. BDR = 0

Dawn has 1 extra card touched, so has higher efficiency by 0.05. Unfortunately, piano and kimbi has 1 trash card that is unknown. Hyphenated line, kimbi needs a fix. Arguably, with such a small difference in efficiency with similar board states, this game should be a wash. However, since dawn is still "marginally" better I'll give Dawn a win on this one.

Dawn 1 - Hyphenated 1

Game 3

2022-01-06_14-45-41 (dawn3a)

Above image is the Dawn line.

2022-01-06_14-45-41 (dawn3b)

Above image is the Hyphenated line.

Dawn - 6 clues, 4 cards played, 8 cards touched, 2.0 efficiency. BDR = 0

Hyphenated - 6 clues, 4 cards played, 9 cards touched, 2.17 efficiency. Team has 1 known trash card touched. BDR = 0

The Hyphenated line has more cards touched with the same number of clues and cards played.

Dawn 1 - Hyphenated 2

Game 4

2022-01-06_15-28-19 (dawn4b)

Above image is the Dawn line.

2022-01-06_15-27-50 (dawn4)

Above image is the Hyphenated line.

Dawn - 4 clues, 2 cards played, 5 cards touched, 1.75 efficiency, 1 for 1 on y2 to continue. BDR = 0

Hyphenated - 3 clues, 2 cards played, 6 cards touched, 2.67 efficiency, 1 for 1 purple (baton p4) and 2 for 1 yellow to continue the game. BDR = 0

Hyphenated spent 1 clue less and has more cards touched, while also having two 2 for 1's to continue the game.

Dawn 1 - Hyphenated 3

Game 5

2022-01-06_15-56-54 (Dawn 5)

This game is a wash, no Dawn moves were used. Piano should've opened with a self 3 bluff as a 3 for 1, which works in both Dawn and Hyphenated. The mistake of starting with a 1 for 1 is what allowed a Dawn move to become available.

Dawn 1 - Hyphenated 3

Game 6

2022-01-06_15-56-54 (Dawn 6b)

Above image is the Dawn line.

2022-01-06_15-56-54 (Dawn 6d)

Above image is the Hyphenated line. I wanted to double-check that Hyphenated didn't have any continuation, so I had to replay it. I found a better Hyphenated line.

Dawn - 4 clues, 5 cards played, 6 cards touched, 2.75 efficiency, continuation is a 2 for 1 by finessing g1. BDR = 1

Hyphenated line - 5 clues, 5 cards played, 7 cards touched + 1 known trash, 2.4 efficiency. Continuation is 2 for 1 reds to kimbi, and Null 2 is on finesse position. BDR = 0

The Hyphenated line has lower efficiency by 0.35, but has a BDR of 0. BTW this is a null game; earlier comments on this thread said Dawn would be stronger in null. It makes sense, since you're able to touch brand new cards to get different slots, whereas in Hyphenated you generally need to re-clue already touched cards to signal a position.

Dawn 2 - Hyphenated 3

Game 7

2022-01-06_15-56-54 (Dawn 7)

Above image is the Dawn line.

2022-01-06_15-56-54 (Dawn 7b)

Above image is the Hyphenated line.

Dawn - 3 clues, 3 cards played, 6 cards gotten, 3.0 efficiency. BDR = 0

Hyphenated - 3 clues, 3 cards played, 5 cards gotten, 2.67 efficiency. BDR = 0

Dawn 3 - Hyphenated 3

Game 8

2022-01-06_15-56-54 (Dawn 8a)

Above image is the Dawn line.

2022-01-06_15-56-54 (Dawn 8b)

Above image is the Hyphenated line. The image is on T8 because I had the R1 play, so R1 is touched, not played.

Dawn - 3 clues, 3 cards played, 6 cards gotten, 3.0 efficiency. Continuation: Piano must save piper with a 1 for 1. Piper then gives yellow to kimbi as a 2 for 1. BDR = 0

Hyphenated line 2 - 3 clues, 3 cards played, 6 cards gotten, 3.0 efficiency, 3 for 1 to continue with 3's to kimbi. BDR = 0

Equal efficiency; however, Hyphenated has the better continuation with a more efficient clue.

Dawn 3 - Hyphenated 4

Game 9

2022-01-06_15-56-54 (Dawn 9a)

Above image is the Dawn line.

2022-01-06_15-56-54 (Dawn 9b)

Above image is the Hyphenated line.

Oh, this is an interesting game. Piper doesn't do his blind play to initiate a 5 double pull. I had piper do this in both the Dawn and Hyphenated lines. I did the line where both teams stopped the bomb.

Dawn line - 5 clues, 1 card played, 9 cards gotten + 1 trash, 2.00 efficiency. BDR = 0

Hyphenated line - 4 clues, 1 card played, 8 cards touched (g1 in piano's hand is gotten from 5 double pull, i4 on kimbi is saved with should from piper), 2.25 efficiency. BDR = 0

Dawn 3 - Hyphenated 5

2022-01-06_15-56-54 (Dawn 11)

Above image is the 4C example.

I realized that you are able to still do the same clues in both Dawn and Hyphenated, so this is a wash. This was probably given as an example before 4 charms were modified.

Dawn 3 - Hyphenated 5

Hyphenated is the clear winner. 2 of Dawn's wins were in Null.

Dr-Kakashi commented 2 years ago

I've played all the games piano posted in this thread and analyzed them comparing dawn lines to Hyphenated lines.

Overall, Dawn was worse off, except when a Null variant was played. I'm still hesitant to say that Dawn is marginally better in null, as we don't have enough samples.

For the point that Dawn "significantly" put the team in a better position to continue, I didn't find that to be the case. In most of the Hyphenated lines, you can see that a Hyphenated crew would be in a better position to continue the game.

For the argument that we are able to touch 4's more often in Dawn, I didn't find that to be the case, as you can still do charms in the hyphenated line. The argument that you can do 4 double bluffs on 2 away 4s just doesn't hold because the window is so short. In the short window, a 2 away 4 must be available in addition to the next 2 players having blind plays. There just aren't that many cases of that. Even if you did find the cases, how often do they occur?

For all the other cases where you can touch 4's in dawn, you can do the same in the Hyphenated line. That's why on the 4C examples it's equal to hyphenated.

In theory, I understand the point of using 3's as discharges while being able to immediately tell the clue receiver that they're holding a 2 or a 3 (that's two away). The whole Dawn theory hinges on every slot in a player's starting hand having an equal probability of being playable. However, that stops holding once the first play/discard happens, which does happen during the two-round window, as it's much more likely that the player has now drawn a playable on slot 1.

It's also more consistent to just use Hyphenated conventions throughout the entire game. It's odd to need to switch between convention sets (deleting 3 bluffs changes the Hyphenated conventions). Even if we say Dawn is "marginally" better, is it really worth adding an extremely short phase with its own set of conventions (and the sum of the complexity of those conventions)? I find the answer to be no.

Overall, hyphenated is able to touch more cards, in addition to having better continuation lines.

Dr-Kakashi commented 2 years ago

Another thing of note that I don't see being mentioned:

Let's take all the games of Hanabi in existence.

There will be games where there are no Dawn moves available. That is a substantial hit to Dawn, because in those games the time spent learning the phase and memorizing its conventions would go unused.

I want to see unbiased proof/data. We are talking about adding in a whole phase for players to be aware of here. Ideally, we would have 100 random samples, so that we know the ratio of games that have Dawn moves available to games that don't (this would let us decide whether we should even add a phase or not).

Then it needs to be taken further: of the games where there are Dawn moves available, how many would be better in the Hyphenated line vs the Dawn line? It isn't good enough if Dawn is equal to Hyphenated. It must be clear that Dawn is superior.
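To make the two steps concrete, here is a minimal sketch of how such a sample could be summarized. The `GameRecord` fields and the `summarize` helper are hypothetical names for illustration, and the records at the bottom are placeholders, not real results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GameRecord:
    dawn_move_available: bool
    # "dawn", "hyphenated" or "tie"; None when no Dawn move was available
    better_line: Optional[str] = None


def summarize(records: list[GameRecord]) -> dict[str, float]:
    """Step 1: how often is a Dawn move available? Step 2: which line is better?"""
    if not records:
        raise ValueError("no games sampled")
    with_dawn = [r for r in records if r.dawn_move_available]
    summary = {"dawn_available_ratio": len(with_dawn) / len(records)}
    if with_dawn:
        for line in ("dawn", "hyphenated", "tie"):
            count = sum(r.better_line == line for r in with_dawn)
            summary[f"{line}_share"] = count / len(with_dawn)
    return summary


# Placeholder records only, NOT real results.
sample = [
    GameRecord(False),
    GameRecord(True, "dawn"),
    GameRecord(True, "hyphenated"),
    GameRecord(True, "tie"),
]
print(summarize(sample))
```

Ties are tracked separately here because, by the standard above, a tie does not count in Dawn's favor.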

With a sample size of 8 games that were specifically chosen to show how good Dawn is, Dawn is better 37.5% of the time and Hyphenated is better 62.5% of the time.
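For anyone who wants to re-check those percentages, a quick sketch that tallies the per-game verdicts from the analysis earlier in this thread. The dictionary just restates those verdicts ("wash" means no point was awarded), and the variable names are only illustrative.

```python
from collections import Counter

# Verdicts as scored in the game-by-game analysis earlier in this thread.
verdicts = {
    "Game 1": "hyphenated",
    "Game 2": "dawn",
    "Game 3": "hyphenated",
    "Game 4": "hyphenated",
    "Game 5": "wash",
    "Game 6": "dawn",
    "Game 7": "dawn",
    "Game 8": "hyphenated",
    "Game 9": "hyphenated",
    "4C example": "wash",
}

tally = Counter(verdicts.values())
decided = tally["dawn"] + tally["hyphenated"]  # washes are excluded

for side in ("dawn", "hyphenated"):
    print(f"{side}: {tally[side]}/{decided} = {tally[side] / decided:.1%}")
# dawn: 3/8 = 37.5%
# hyphenated: 5/8 = 62.5%
```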

Unfortunately, with the examples given, Hyphenated is just better.

hanabi / hanabi.github.io

Convention/Framework Proposal: Dawn #589

Dawn