alex1770 / wordle

Finding optimal play in the game of wordle
MIT License

Word lists with non-Latin characters? #2

Open mourner opened 2 years ago

mourner commented 2 years ago

Hi @alex1770! I'd like to use your wonderful tool to find optimal guesses for localized versions of Wordle (Ukrainian in this case), but it seems the current code has the English alphabet hardcoded: supplying word lists with Cyrillic letters makes the program fail (no-legal-guess). How hard would it be to extend it to work with non-Latin letters?

alex1770 commented 2 years ago

Hi Vladimir! Thank you for your kind words. In principle it shouldn't be too hard to adapt this. There is some input and output using C++ strings which could be modified to handle UTF-8. The only very slight niggle is that in hard mode it uses packed bit vectors to test for legal guesses. (E.g., in hard mode if you guess ABACK and this scores green,black,yellow,yellow,black, then subsequent guesses need to have at least 2 As and 1 C.) It does this by making a vector of multiplicities (0-3) of each letter and packing the 2-bit quantities into a single 64-bit integer (using the method referred to here). As I understand it from Wikipedia, Ukrainian Cyrillic has 33 letters, which at 2 bits each would need 66 bits and so wouldn't quite fit into 64 bits unless you happened to know that two of the letters can occur at most once, in which case it could be done because they'd only require 1 bit each. This isn't a big problem; it just means it would require some slightly different code and would run a tiny bit slower.
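
To make the packing idea concrete, here is a minimal C++ sketch of the multiplicity test described above. It is not the code from wordle.cpp, and the field-wise comparison is just one simple way to do it, not necessarily the method the comment links to. The names `pack_counts`, `any_field_less` and `is_legal_hard_mode` are hypothetical, the input is assumed to be uppercase A-Z, and only the "at least this many of each letter" condition is checked (position constraints from greens are left out).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: count each letter of an uppercase A-Z word, cap the
// count at 3, and store it in a 2-bit field (letter i uses bits 2i and 2i+1).
std::uint64_t pack_counts(const std::string& word) {
    std::uint64_t packed = 0;
    for (char ch : word) {
        int i = ch - 'A';
        std::uint64_t field = (packed >> (2 * i)) & 3u;
        if (field < 3) packed += std::uint64_t{1} << (2 * i);  // saturate at 3
    }
    return packed;
}

// True if some 2-bit field of x is smaller than the same field of y.
// A field (x1,x0) is less than (y1,y0) when x1 < y1, or x1 == y1 and x0 < y0.
bool any_field_less(std::uint64_t x, std::uint64_t y) {
    const std::uint64_t LO = 0x5555555555555555ULL;  // low bit of each field
    const std::uint64_t HI = 0xAAAAAAAAAAAAAAAAULL;  // high bit of each field
    std::uint64_t lt_bits = ~x & y;        // bit positions where x has 0, y has 1
    std::uint64_t hi_eq = ~(x ^ y) & HI;   // fields whose high bits agree
    return ((lt_bits & HI) | ((hi_eq >> 1) & lt_bits & LO)) != 0;
}

// A guess passes the multiplicity test if no letter occurs fewer times
// than required.
bool is_legal_hard_mode(const std::string& guess, std::uint64_t required) {
    return !any_field_less(pack_counts(guess), required);
}

// Bit budget: 26 Latin letters fit, 33 letters at 2 bits each do not,
// but 31 two-bit fields plus 2 one-bit fields use exactly 64 bits.
static_assert(26 * 2 <= 64, "Latin alphabet fits in 64 bits");
static_assert(33 * 2 > 64, "33 two-bit fields overflow 64 bits");
static_assert(31 * 2 + 2 * 1 == 64, "two 1-bit letters make 33 letters fit");
```

For the ABACK example, the requirement "at least 2 As and 1 C" can be built as `pack_counts("AAC")`; `is_legal_hard_mode("CANAL", pack_counts("AAC"))` is then true, while CABLE fails because it has only one A. A 33-letter alphabet would need either a wider integer type or the mixed 1-bit/2-bit layout noted in the last assertion.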

mourner commented 2 years ago

Thanks so much for the thorough response! Another thing I tried was simply mapping Ukrainian Cyrillic to Latin (using both lowercase and uppercase letters) and feeding in the resulting word lists, but I get this failure after the first guess:

Assertion failed: (depthonly||inc>=lb[s]), function sumoverpartitions, file wordle.cpp, line 434.

alex1770 commented 2 years ago

It's not exactly meant to work with mixed-case Latin letters like this, but if you send the files and say what command you used, then I can try to make it work.