barrust / pyspellchecker

Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/
MIT License

Custom Dictionary not loading #88

Closed barrust closed 3 years ago

barrust commented 3 years ago

Hello Mr. @barrust, I want to build my own dictionary. I already have the list of words in a .txt file. I didn't know how to set the distance, so I assigned 1 to each word, converted the list to .json, and compressed it to .gz. I put the .json.gz file in scripts/data. When I tried to spellcheck, a word that is supposed to be correct was reported as incorrect. Is there any way to set up a dictionary based on a list of words?

note: I apologize in advance because I only have a basic understanding of AI and am relatively slow at programming.

Thank you. 😊

Originally posted by @paularon in https://github.com/barrust/pyspellchecker/discussions/76#discussioncomment-340012

barrust commented 3 years ago

@paularon this is more of an issue than a general discussion (so I moved it into an issue!). To be able to help, I will need more information on your setup and how you loaded the dictionary. From your description, it sounds like you only placed the dictionary in scripts/data/, but nothing is actually loaded from there; that directory just holds the data I used to build the bundled dictionaries. You still need to load your dictionary explicitly using one of the available loading methods.

Assuming the new dictionary is in scripts/data/:

from spellchecker import SpellChecker
spell = SpellChecker(language=None, local_dictionary="scripts/data/my-dictionary.json.gz") 
# use your dictionary

If you wanted to use the text file you can do that like so:

from spellchecker import SpellChecker 
spell = SpellChecker(language=None)  # we don't want a general language!
spell.word_frequency.load_text_file("path_to_my_file.txt")  # tokenizes the file and counts each word
# use your new dictionary 
spell.export("new_dictionary.json.gz")  # export it to be used again in the future 
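
If you would rather build the .json.gz file yourself from the word list, here is a minimal sketch using only the standard library. It assumes the dictionary file is a gzip-compressed JSON object mapping each word to a count, and that the text file has one word per line; the function name build_dictionary and the file names are made up for illustration, and lowercasing the words is my assumption, not a requirement stated above.

```python
import gzip
import json
from collections import Counter


def build_dictionary(words_path, out_path):
    """Write a gzipped JSON object mapping each word to its count.

    Hypothetical helper: assumes one word per line in words_path.
    """
    counts = Counter()
    with open(words_path, encoding="utf-8") as handle:
        for line in handle:
            word = line.strip().lower()  # assumption: store words lowercased
            if word:  # skip blank lines
                counts[word] += 1
    # "wt" opens the gzip stream in text mode so json can write str data
    with gzip.open(out_path, "wt", encoding="utf-8") as handle:
        json.dump(counts, handle)
    return dict(counts)
```

A file written this way would then be passed through local_dictionary as in the first snippet, rather than just being left in scripts/data/.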

I hope this is helpful. Please let me know!