mammothb / symspellpy

Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
MIT License
800 stars, 122 forks

How to empty the dictionary quickly #125

FrankDataAnalystPython closed this issue 2 years ago

FrankDataAnalystPython commented 2 years ago

Dear Sir: I am wondering whether there is a quick, inexpensive way to empty the .words dict after loading a dictionary.

I have a case where I need to update the SymSpell dictionary constantly, but each initialization is very slow.

Regards

FrankDataAnalystPython commented 2 years ago

Dear Sir: I am also wondering whether there is a way to restrict the distance calculation to a subset of symspell.words. Many thanks.

Regards

mammothb commented 2 years ago

> I am wondering whether there is a quick, inexpensive way to empty the .words dict after loading a dictionary.
>
> I have a case where I need to update the SymSpell dictionary constantly, but each initialization is very slow.

The words and other related data are stored in member variables of the SymSpell object, and these are implemented as dictionaries. So I think you can probably just call the Python built-in dict method clear() on each of them.
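A minimal sketch of that idea follows. The attribute names in `ASSUMED_DICT_ATTRS` are assumptions about symspellpy's private members and may differ between versions, so check them against the installed source before relying on this. The helper name `clear_symspell` is hypothetical.

```python
# Assumed private attribute names; verify against your installed symspellpy version.
ASSUMED_DICT_ATTRS = ("_words", "_deletes", "_below_threshold_words", "_bigrams")


def clear_symspell(sym_spell, attr_names=ASSUMED_DICT_ATTRS):
    """Empty the dict-valued attributes of sym_spell in place.

    Returns the names of the attributes that were actually cleared, so a
    missing or renamed attribute is easy to spot.
    """
    cleared = []
    for name in attr_names:
        store = getattr(sym_spell, name, None)
        if isinstance(store, dict):  # also matches collections.defaultdict
            store.clear()
            cleared.append(name)
    return cleared
```

Clearing in place keeps the existing object and its constructor settings, so new entries can be added afterwards without creating a new instance.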

> I am also wondering whether there is a way to restrict the distance calculation to a subset of symspell.words.

You'll have to write your own custom functions/methods for this. You could try loading only a subset of the dictionary words before running the lookup methods.
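One way to load only a subset is sketched below, assuming the word frequencies are already available as a plain dict. `build_subset_speller`, `make_speller`, `frequencies` and `allowed_words` are hypothetical names introduced for illustration; the only library call used is `create_dictionary_entry(word, count)`.

```python
def build_subset_speller(make_speller, frequencies, allowed_words):
    """Return a new speller that contains only the allowed words.

    make_speller: zero-argument callable returning an empty speller,
        e.g. lambda: SymSpell(max_dictionary_edit_distance=2)
    frequencies: mapping of word -> count for the full dictionary
    allowed_words: iterable of words to keep
    """
    allowed = set(allowed_words)
    speller = make_speller()
    for word, count in frequencies.items():
        if word in allowed:
            speller.create_dictionary_entry(word, count)
    return speller
```

With symspellpy installed, something like `build_subset_speller(lambda: SymSpell(max_dictionary_edit_distance=2), freqs, subset)` should give a speller whose lookup methods only ever see the subset. Each subset still pays the cost of generating its delete entries once.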

To make switching between different dictionary "subsets" quicker, I think you can save the words and related member variables as pickles and then just load them instead of parsing the dictionary again. You can refer to the _load_pickle_stream() method to see which member variables should be saved and loaded.
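A rough sketch of such a cache is shown below. It assumes the speller exposes public `save_pickle(filename)` and `load_pickle(filename)` methods, which I believe recent symspellpy releases provide; verify this for your version. `load_or_build` and the `build` callback are hypothetical names.

```python
import os


def load_or_build(speller, pickle_path, build):
    """Fill speller from a cached pickle, or build it once and cache it.

    build: callable that takes the empty speller and populates it,
        e.g. by calling load_dictionary or create_dictionary_entry.
    """
    if os.path.exists(pickle_path):
        # Fast path: restore the previously saved member variables.
        speller.load_pickle(pickle_path)
    else:
        # Slow path: parse the dictionary once, then save it for next time.
        build(speller)
        speller.save_pickle(pickle_path)
    return speller
```

Keeping one pickle file per subset means a switch costs only a load. A pickle should only be loaded into a speller created with the same settings it was saved with, and only from files you trust.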

FrankDataAnalystPython commented 2 years ago

Many Thanks!!!