JoeBussard / polywordleapi

the api for playing polywordle
0 stars 0 forks source link

Word lookup is O(n) time #49

Open JoeBussard opened 2 years ago

JoeBussard commented 2 years ago

Word lookup is O(N) time.

The dictionaries for common words and "all" words are stored on disk as JSON objects with the schema {"":""}.

Checking if a word exists in one of those dictionaries means converting its values to a list and iterating through each item in the list, i.e. new_guess in list(all_words.values()).

Hashing the words themselves would make it O(1) to check if a word is in the dictionary. That is literally why it exists.

Note: This was useful in legacy systems when I wanted to be able to access words from the dictionary via numerical ID rather than array index. Removing words from the dictionary meant shifting the array index of every word after it. That functionality is no longer needed, or implemented, and there are more elegant and resource-efficient ways to achieve that.

JoeBussard commented 2 years ago

This is in backend_run_game.py at line 48.

JoeBussard commented 2 years ago

Best solution is probably a quick script to load common_words.json and all_words.json and store just the words (values) in space-separated plaintext file.

The way words are loaded from common_words.json when creating a new game does not fit with the current paradigm. It chooses a random integer value from 0-N and picks the word with that index. This is a holdover from a previous method of assigning an ID to a game solution. Now that we use UUID, assigning an integer key to every possible word is no longer needed.

Common words should be stored as a set. Method set_random_solution() in backend_create_new_game.py line 88 should be changed to use random.choice(common_words) instead of the random index process.