barrust / pyspellchecker

Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/
MIT License

Spanish words are not corrected when they are missing a tilde (´) #65

Closed: benjamimo1 closed this issue 3 years ago

benjamimo1 commented 4 years ago

Cancion is not corrected to Canción.

barrust commented 4 years ago

So, canción is in the dictionary, but so is cancion:

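from spellchecker import SpellChecker

# Load the bundled Spanish dictionary
sc = SpellChecker(language='es')

print('canción' in sc)  # should be True
print('cancion' in sc)  # also True, so it is treated as a known word
# Compare how often each form appears in the dictionary
print(sc.word_frequency['canción'])
print(sc.word_frequency['cancion'])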

We can always clean up the dictionary by removing words that are not actually correct (such as cancion). The dictionaries were built using an external project. Any help finding and removing incorrect, or likely incorrect, words would be appreciated.
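One possible way to look for likely incorrect entries such as cancion is to flag unaccented words whose accented twin is much more frequent. The sketch below is only an illustration under stated assumptions: the helper names (strip_acute, likely_missing_accent), the ratio threshold, and the sample counts are all made up for this example and are not part of pyspellchecker. It works on any plain word-to-count mapping and uses only the standard library.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, the Spanish "tilde"


def strip_acute(word):
    """Remove acute accents only (ó -> o); ñ and ü are left untouched."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def likely_missing_accent(frequencies, ratio=10):
    """Map each unaccented word to its accented variants that are at least
    `ratio` times more frequent. `ratio` is an arbitrary, hypothetical cutoff.
    """
    suspects = {}
    for word, count in frequencies.items():
        bare = strip_acute(word)
        if bare == word:
            continue  # the word carries no acute accent
        bare_count = frequencies.get(bare, 0)
        if bare_count and count >= ratio * bare_count:
            suspects.setdefault(bare, []).append(word)
    return suspects


# Made-up counts for illustration; these are not real dictionary values.
sample_counts = {
    "canción": 5000, "cancion": 12,
    "año": 9000, "ano": 300,
    "papá": 900, "papa": 800,
}
print(likely_missing_accent(sample_counts))  # {'cancion': ['canción']}
```

The output would still need manual review, because Spanish has valid pairs that differ only by the accent (for example mas and más) and a frequency ratio cannot tell those apart from mistakes. Reviewed words could then be dropped from a loaded checker, which should be possible with sc.word_frequency.remove_words(...).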

benjamimo1 commented 4 years ago

Thanks for your reply. Although I don't have the time, I appreciate the project and anyone who is able to improve it.