Wrong guess score with multiple copies of a letter

gpiancastelli commented 2 years ago

There's a bug in your score_guess function. If the guess contains two copies of a letter, and that letter is present only once in the answer, and the second copy in the guess matches that letter in the answer, the first copy will be marked as WRONG_PLACE, while the second copy will be marked as NO.

I'm not an English native speaker, so I can't come up with a real example just right now. But let's say we have 'A__A_' as our guess and '___A_' as the answer (let's just ignore letters marked as _, which we presume are all different from A). Your score_guess function will return '🟨__⬜_' instead of '⬜__🟩_'.

OverkillGuy commented 2 years ago

Hey, thanks for the bug report!

I grepped for answer/guess words that match this format, to feed as examples. Found this:

secret answer: uncut (repeating u)
guess: zebus (only one u, at second place)

This seems to match your definition. I've created these as a test case in a branch of the repo (not using the fancy literate programming tools), but I can't seem to reproduce your report of score 🟨__⬜_, instead I do get this: ⬜__🟩_?

Curious to investigate this further!

OverkillGuy commented 2 years ago

Oh no, I misunderstood your question, you have the reverse of mine (I swapped answer and guess). I'll pick words again, sorry!

OverkillGuy commented 2 years ago

You are right indeed! See proper commit, using:

Guess: xenon (two n) Answer: train (one n, at the end)

Score should be ⬜⬜⬜⬜🟩, actually is ⬜⬜🟨⬜⬜!

Bug confirmed! Thank you! I'll see what I can do (any ideas?)

OverkillGuy commented 2 years ago

Okay, I've got the draft of a solution which passes tests again in #2.

I don't like the look of the fix (changing response to a dict[int, str] is ugly) but this is a start. I'll revisit later.

@gpiancastelli I'd be happy to hear your feedback on this fix.

Of course, once the fix is ready on the code side, I'll have to make an update to the actual "story" that explains our change, and I'll be sure to credit this finding to you.

gpiancastelli commented 2 years ago

I wrote a possible solution yesterday but I need to see if I can change one or two things that I don't like. It's both similar (two passes, still uses a Counter) and different (the Counter is not used for the entire answer) from yours. Hope to post something in the next few hours.

gpiancastelli commented 2 years ago

So here's my take:

def score_guess(guess, answer):
    result = [CharacterScore.NO] * len(answer)

    rem_answer = Counter() # unmatched letters in answer
    rem_guess = [] # indexes of unmatched letters in guess

    # check for exact matches
    for i, (a, g) in enumerate(zip(answer, guess)):
        if a == g:
            result[i] = CharacterScore.OK
        else:
            rem_answer[a] += 1
            rem_guess.append(i)

    # check for letters in other places
    for i in rem_guess:
        letter = guess[i]
        if letter in rem_answer:
            rem_answer[letter] -= 1
            result[i] = CharacterScore.WRONG_PLACE
        if rem_answer[letter] == 0:
            del rem_answer[letter]

    return ''.join(result)

Unfortunately, I haven't been able to imagine a different algorithm that would eliminate the need for two loops. But the result is a plain list. The Counter is needed only for the letters remaining after the check for exact matches.

I have worked off-project (i.e. in a separate script file) so of course please double check that all your tests are passing. Sorry about that, but I believe I haven't touched Python seriously in the last year and a half.

OverkillGuy commented 2 years ago

Ooh, thanks.

As mentioned above, I wanted to make another pass to remove that dict[int, str] for response, and was idly wondering how I could use a list[str], maybe even defaulting to CharacterScore.NO, and you showcased that very elegantly!

I've run your code as-is in the project, and it does absolutely pass all tests! Separately, I'm curious if you want to try the project's structure.

Interesting approach, iterating on only the unmatched letters! It means another datastructure or two, but it gives a more specific iteration, that's cool. I was trying to remove my uses of del in existing code, as it is an unusual part of Python I don't want to encourage, and it felt less pretty than the rest. I also may not want to go for the different datastructure, keeping it simpler, at the cost of skipping over NO case explicitly, which feels like acceptable tradeoff for less datastructures to juggle.

Thanks to your code as inspiration, I now have the following, which feels like a nice balance of short + elegant, while keeping in the spirit of the existing (if buggy) code:

def score_guess(guess: str, answer: str) -> str:
    """Score an individual guess with Counter"""
    # Counter("abbey") = Counter({'b': 2, 'a': 1, 'e': 1, 'y': 1})
    answer_chars = Counter(answer)
    # NO is the default score, no need to detect it explicitly
    response: list[str] = [CharacterScore.NO] * len(answer)
    # First pass to detect perfect scores
    for char_index, (answer_char, guess_char) in enumerate(zip(guess, answer)):
        if answer_char == guess_char:
            response[char_index] = CharacterScore.OK
            answer_chars[guess_char] -= 1
    # Second pass for the yellows
    for char_num, (guess_char, existing_score) in enumerate(zip(guess, response)):
        if existing_score == CharacterScore.OK:
            continue  # It's already green: skip
        if guess_char in answer_chars and answer_chars[guess_char] > 0:
            response[char_num] = CharacterScore.WRONG_PLACE
            # Reduce occurence counter since we "used" this occurence
            answer_chars[guess_char] -= 1
    return "".join(response)

Passes all tests, and is much nicer than my previous code. Will let it simmer for the rest of the weekend, and then I'll look into writing an addendum to the narrative, explaining the bug, the fix, and reflecting on the reason the bug wasn't caught during original development.

gpiancastelli commented 2 years ago

Counter is both a blessing and a curse. It does everything we need, but in a weird way which does not exactly correspond to an ideal bag/multiset. If it wasn't there, we might have defined a data structure with a "better" interface and implementation.

I like the idea of removing del, but just because it really feels unnecessary. You could have switched to calling pop, but decided to change the way you check the presence of guess_char inside answer_chars, which IMHO is more in line with how Counter works.

I still like the additional data structures I introduced, both because they are specific to the second iteration (as you said) and because in the case of the winning guess they are empty and you never enter that iteration. But maybe your version is more streamlined, I can't really say. I think a middle-ground version could be written, where the Counter is used like I did, but there is no additional list and the second loop runs on the (possibly partial) result as you did. But, oh well, to each their own I guess.

Thanks for the conversation. I will gladly read your reflections on the bug once you update the document.

OverkillGuy commented 2 years ago

Just merged PR #2 fixing this bug, and published the associated addendum on the narrative, which I released as v1.1.0 tag. I hope you enjoy the few extra words that your report allowed me to write.

Thanks again for taking the time to report such a flagrant error, and for discussing solutions, that was very helpful.

Have a nice day, Jb

gpiancastelli commented 2 years ago

Hi again, thanks for notifying me!

Just a further heads-up. You devoted a paragraph to the use of get in the following line:

if answer_chars.get(guess_char, 0) > 0:

to catch both the “key is not set” as well as “key is set to zero” cases in the same way.

However, answer_chars is not a simple dict; it's a Counter. And the documentation says that "Counter objects have a dictionary interface except that they return a zero count for missing items." So your line could just be written as:

if answer_chars[guess_char] > 0:

If I may, I would suggest to use Counter by its proper protocol, make the reader notice it, and enrich the discussion by citing the existence of get for "raw" dict objects.

OverkillGuy commented 2 years ago

Just a further heads-up. [...]

Fantastic point, that makes my life easier (I was mildly annoyed at having to make the code a tiny bit less pretty for non-experts, this fixes it).

Created as https://github.com/OverkillGuy/literate-wordle/pull/3, will merge within the week after I let it cool down for an editing pass. Feel free to comment on that as we iterate.

I expect this to be a patch release, updating the changelog but not a whole new section.

OverkillGuy commented 2 years ago

Merged and deployed as v1.1.1!

Thanks again for your help.

OverkillGuy / literate-wordle

Wrong guess score with multiple copies of a letter #1