RyanRana / Elimination-Based-Wordle-Solver-Algoritham

im done losing on wordle, i can't be lacking now
MIT License

Elimination-Based-Wordle-Solver-Algoritham

A Python application that solves actual Wordles. It first guesses the word with the highest total of common letters, then uses the game's response to that guess to eliminate every word that doesn't satisfy the known conditions. From the reduced list it again picks the word with the most common letters, and this process is repeated until the answer is found.

Common letters are ranked by Lexico. The English five-letter word list is by Donald Knuth, as published by charlesreid1 on GitHub.

PseudoCode: https://excalidraw.com/#json=ZcQ9Ri7sJwTFFhln9I9Zk,MLXCyelYiZnbpAJ4cbx5dw

https://www.nytimes.com/games/wordle/index.html

Built by a Brooklyn programmer, Josh Wardle, Wordle is one of the most played games so far in 2022 despite how simple it is. Most people in America already know what this game is and how it works, so I won’t waste your time explaining it. If you don’t know about it, you clearly have been living under a rock, and I refer you to https://www.nytimes.com/games/wordle/. The purpose of this article is to present a simple method for solving Wordles using Python by narrowing the list of five-letter English words down to one. The code for this project can be found in this GitHub repository, https://github.com/RyanRana/Elimination-Based-Wordle-Solver-Algoritham. Keep in mind that this article is about the theory behind the program, not the syntax itself.

The first step is to compile a list of all possible five-letter words. Donald Knuth has already compiled such a list, and its 5757 words can be found in this GitHub repository, https://github.com/charlesreid1/five-letter-words. In addition to this dataset, we need a two-column list of the letters of the alphabet and the frequency of their appearance in English words. An analysis of entries in the Concise Oxford Dictionary, ignoring frequency of word use, gives an order of “EARIOTNSLCUDPMHGBFYWKVXZJQ”. The letter-frequency table used here is taken from Pavel Mička’s website, which cites Robert Lewand’s Cryptological Mathematics, and can be found at https://en.wikipedia.org/wiki/Letter_frequency#Relative_frequencies_of_letters_in_the_English_language.

The overall theory is that we need to find the words with the highest “score”, where a word’s score is calculated by adding together the frequencies of all its letters. For example, with E at roughly 13%, EEEEE would have a score of 13*5=65. This can be done by iterating through the entire list of words and, for each word, iterating through its letters and adding each letter’s frequency to that word’s score.
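
As a rough illustration of the scoring idea, here is a minimal Python sketch. The names `LETTER_FREQ`, `score_word` and `best_word` are made up for this example and are not taken from the repository, and the frequencies are approximate percentages rounded to one decimal place, so they should be swapped for the exact table before relying on the numbers. With these rounded values EEEEE comes out at about 63.5 rather than the rounded 65 used in the text.

```python
# Approximate English letter frequencies in percent, rounded to one
# decimal place. These are illustrative values (an assumption of this
# sketch); replace them with the exact table you want to use.
LETTER_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 1.0, "k": 0.8, "j": 0.2, "x": 0.2,
    "q": 0.1, "z": 0.1,
}


def score_word(word):
    """Add up the frequency of every letter in the word (repeats count)."""
    return sum(LETTER_FREQ.get(letter, 0.0) for letter in word.lower())


def best_word(words):
    """Return the word with the highest score by scanning the whole list."""
    return max(words, key=score_word)


if __name__ == "__main__":
    # A tiny stand-in for the 5757-word list.
    sample = ["adieu", "fuzzy", "queue"]
    for word in sample:
        print(word, round(score_word(word), 1))
    print("best:", best_word(sample))
```

Because repeated letters are counted every time they appear, a word such as EERIE scores highly under this exact rule. Whether to count each distinct letter only once is a design choice the text leaves open.
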
This scoring process needs to be repeated 5757 times, once for each word. If we were instead to generate combinations from letters alone, it would need to be repeated 26⁵=11881376 times, or we would need a max elimination algorithm that starts with EEEEE (max score) and then finds the second biggest combinations (EEEEA, EEEAE, EEAEE, EAEEE, AEEEE). These combinations would then need to be checked against the word list to make sure they exist; EEEEE, for example, is not an actual word. Therefore iterating through the word list directly is the best method.

After all the words have been scored, the highest-ranked word is chosen, which happens to be ADIEU. The word is then entered into the actual game, and based on the result from the game more words are eliminated. If a letter is grey, the program eliminates all words containing that letter. If it is green, the program eliminates all words that don’t have that letter at that exact index. If it is yellow, the program eliminates all words that have the letter at that exact index, as well as all words that don’t contain the letter at all. This cuts out MOST of the list regardless of what the output of the game is. Words are eliminated by a simple linear search and removal from the list.

This process is repeated over and over until only one word remains, which is the final answer of the Wordle solver. In the OFF chance that it doesn’t get down to one word, the algorithm just picks the word with the highest score. And that is the theory behind a computerized algorithm to solve Wordles almost every time.

Actual Project: https://github.com/RyanRana/Elimination-Based-Wordle-Solver-Algoritham
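
The elimination rules and the guess-filter-repeat loop described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the repository's code. All function names are hypothetical. The feedback is encoded as a string of `g` (green), `y` (yellow) and `x` (grey). The score uses the Concise Oxford letter order quoted earlier as a simple stand-in for the frequency table. One deliberate departure from the text: a grey letter only removes words containing it when the same letter is not also green or yellow elsewhere in the guess, because otherwise a guess with repeated letters could eliminate the real answer.

```python
from collections import Counter

# Letter order quoted in the text, most common first. Used here as a
# stand-in score: E is worth 26 points, A 25, and so on down to Q at 1.
ORDER = "EARIOTNSLCUDPMHGBFYWKVXZJQ"


def score(word):
    return sum(26 - ORDER.index(letter) for letter in word.upper())


def wordle_feedback(guess, answer):
    """Simulate the game's reply: 'g' green, 'y' yellow, 'x' grey."""
    marks = ["x"] * len(guess)
    # Count answer letters that were not matched exactly (not green).
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
        elif unmatched[g] > 0:
            marks[i] = "y"
            unmatched[g] -= 1
    return "".join(marks)


def is_consistent(word, guess, feedback):
    """Return True if `word` survives the elimination rules for this guess."""
    # Letters known to be in the answer (green or yellow somewhere).
    present = {g for g, m in zip(guess, feedback) if m in "gy"}
    for i, (letter, mark) in enumerate(zip(guess, feedback)):
        if mark == "g":
            if word[i] != letter:          # green: must match at this index
                return False
        elif mark == "y":
            if word[i] == letter or letter not in word:
                return False               # yellow: in word, but not here
        elif letter in present:
            if word[i] == letter:          # grey repeat of a known letter
                return False
        elif letter in word:               # grey: letter is absent
            return False
    return True


def solve(words, get_feedback, max_guesses=6):
    """Guess the top-scoring candidate, filter the list, and repeat."""
    candidates = list(words)
    history = []
    for _ in range(max_guesses):
        if not candidates:
            break
        guess = max(candidates, key=score)
        reply = get_feedback(guess)
        history.append(guess)
        if reply == "g" * len(guess):
            break
        # Simple linear pass that keeps only the surviving words.
        candidates = [w for w in candidates if is_consistent(w, guess, reply)]
    return history


if __name__ == "__main__":
    words = ["adieu", "crane", "slate", "pride", "gloom", "eerie"]
    print(solve(words, lambda guess: wordle_feedback(guess, "gloom")))
```

Always guessing the top-scoring remaining candidate covers both cases in the text: when one word is left it is the only choice, and when several remain the highest score wins. In real play, `get_feedback` would be replaced by whatever reads the colours back from the game.
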