seperman / fast-autocomplete

Fast Autocomplete: When Elasticsearch suggestions are not fast and flexible enough
MIT License

Autocomplete doesn't work for words with starting characters outside a-z #33

Closed: joakimwar closed this issue 2 years ago

joakimwar commented 2 years ago

Describe the bug

The AutoComplete object does not find words that start with characters outside a-z, e.g. "æ", "ø", "å", or "ö".

To Reproduce

from fast_autocomplete import AutoComplete

def construct_autocomplete_dict(vocab: list, freqs: list) -> dict:
    # Map each word to its context dict, e.g. {"ære": {"count": 3}}.
    return {word: {"count": freq} for word, freq in zip(vocab, freqs)}

ac_dict = construct_autocomplete_dict(['ål', 'øver', 'ære', 'ere', 'öra'], [1,2,3,4,5])
ac = AutoComplete(ac_dict)
print(ac.search('æ'))
print(ac.search('ö'))

Expected behavior

Autocomplete finds the words "ære" and "öra". Instead, both searches return an empty list. The same happens for other words that start with letters like these.


joakimwar commented 2 years ago

Oops, I didn't read the whole README: you can actually pass it extra valid characters 😳
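
For anyone landing here with the same symptom, here is a minimal sketch of why a character whitelist produces empty results. It is a toy model written with the standard library only, not the library's actual code: `normalize`, `toy_search`, and `DEFAULT_VALID_CHARS` are hypothetical names, and the assumption that invalid characters are simply dropped before matching is mine.

```python
import string

# Assumption for this sketch: only ASCII lowercase letters are valid by default.
DEFAULT_VALID_CHARS = frozenset(string.ascii_lowercase)


def normalize(text, valid_chars=DEFAULT_VALID_CHARS):
    """Lowercase text and drop every character that is not in valid_chars."""
    return "".join(ch for ch in text.lower() if ch in valid_chars)


def toy_search(words, prefix, valid_chars=DEFAULT_VALID_CHARS):
    """Return the words whose normalized form starts with the normalized prefix."""
    needle = normalize(prefix, valid_chars)
    if not needle:
        # The whole query was filtered away, so there is nothing left to match.
        return []
    return [w for w in words if normalize(w, valid_chars).startswith(needle)]


words = ['ål', 'øver', 'ære', 'ere', 'öra']

# With the default whitelist, 'æ' is stripped from the query and nothing matches.
default_hits = toy_search(words, 'æ')

# Extending the whitelist with the extra letters makes the same query match.
extended_chars = DEFAULT_VALID_CHARS | set('æøåö')
extended_hits = toy_search(words, 'æ', extended_chars)

print(default_hits)
print(extended_hits)
```

In the real library, the equivalent fix should be to pass the extra letters when constructing the object, via the `valid_chars_for_string` keyword described in the README's Unicode section as far as I recall it, for example `AutoComplete(words=ac_dict, valid_chars_for_string=string.ascii_lowercase + 'æøåö')`. Please check the README for the exact parameter name and default before relying on this.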