barrust / pyspellchecker

Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/
MIT License

How to add new words to the dictionary? #27

Closed alonsopg closed 5 years ago

alonsopg commented 5 years ago

As far as I understand from the docs, you can add new words as follows:

from spellchecker import SpellChecker
spell = SpellChecker(distance=1)
spell.word_frequency.load_words(['words', 'to', 'be', 'added', 'to', 'the', 'system', 'Addis Abeba'])

However, when I try to correct a misspelling of "Addis Abeba" as follows, it doesn't work:

In:

misspelled = spell.unknown(['something', 'is', 'hapenning', 'here', 'Adis abebba'])
for word in misspelled:
    print(spell.correction(word))
    print(spell.candidates(word))

Out:

Adis abebba
{'Adis abebba'}
hapenning
{'hapenning'}

So how can I add the word "Addis Abeba" to my dictionary so that inputs like "Addis abeba" or "Adiis abebbba" are checked and corrected?

barrust commented 5 years ago

You may want to increase the distance to 2 and see if that works. You have it set to a distance of 1 but there are 2 letters misplaced.
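To make the edit count concrete, here is a minimal sketch that measures how far the two misspellings from the example are from their intended words. It uses a plain Levenshtein distance written from scratch as a stand-in; the function name `levenshtein` is an assumption for illustration, and the library's own candidate search may count edits differently (for example, by treating a transposition as one edit).

```python
def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance: insert, delete, substitute cost 1 each."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


# Both strings are lower-cased here so that only the spelling differs.
place_edits = levenshtein("adis abebba", "addis abeba")
word_edits = levenshtein("hapenning", "happening")
print(place_edits, word_edits)  # 2 2
```

Both misspellings are two edits away from the intended word, so under this measure a search limited to a distance of 1 cannot reach either of them.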

alonsopg commented 5 years ago

I tried a distance of 2 from the beginning, and the result is still the same:

In:


from spellchecker import SpellChecker
spell = SpellChecker(distance=2)
spell.word_frequency.load_words(['words', 'to', 'be', 'added', 'to', 'the', 'system', 'Addis Abeba'])

misspelled = spell.unknown(['something', 'is', 'hapenning', 'here', 'Adis abebba'])
for word in misspelled:
    print(spell.correction(word))
    print(spell.candidates(word))

Out:


happening
{'happening', 'penning', 'henning'}
Adis abebba
{'Adis abebba'}

What can I do to spell check those tokens?

barrust commented 5 years ago

Ah, I think the issue is that you are passing unknown words capitalized; could you try using those in lower case and see if that resolves your issue?

If so, the unknown and known methods should force lower case.
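As a rough illustration of what forcing lower case could look like, here is a sketch over a plain Python set. The functions `known` and `unknown` below only borrow the method names mentioned above; they are hypothetical standalone helpers, not the library's implementation, and they assume the dictionary itself is stored in lower case.

```python
def known(words, dictionary):
    """Return the lower-cased words that are present in the dictionary."""
    return {word.lower() for word in words} & dictionary


def unknown(words, dictionary):
    """Return the lower-cased words that are missing from the dictionary."""
    return {word.lower() for word in words} - dictionary


# Assumption: dictionary entries are normalized to lower case when loaded.
dictionary = {word.lower() for word in ["words", "to", "be", "added", "Addis Abeba"]}

found = known(["Addis Abeba", "ADDED"], dictionary)
missing = unknown(["Adis abebba", "To"], dictionary)
print(found)
print(missing)
```

With both sides normalized, "Addis Abeba" and "ADDED" are recognized regardless of how the caller capitalizes them, and only the genuinely misspelled "adis abebba" is reported as unknown.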