wolfgarbe / SymSpell

SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
https://seekstorm.com/blog/1000x-spelling-correction/
MIT License

[Question] SymSpell How can I use SymSpell to check French language? (I think there's a missing file?) #91

Closed MyProjectsPage closed 4 years ago

MyProjectsPage commented 4 years ago

Hi,

Thanks for the great work. I need to use SymSpell for French. The English version needs two files: frequency_bigramdictionary_en_243_342.txt and frequency_dictionary_en_82_765.txt.

In the SymSpell.FrequencyDictionary directory there is only one file for French: fr-100k.txt (the dictionary). Where can I find the frequency_bigramdictionary file for French? It seems to me that one file is missing. Thanks :-)

wolfgarbe commented 4 years ago

The bigram dictionary is optional.

It is used only in LookupCompound to improve correction quality; LookupCompound also works without a bigram dictionary loaded.

Lookup and WordSegmentation do not use the bigram dictionary at all.

You could create your own French bigram dictionary, e.g. from the Google Books Ngram data.
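
A minimal sketch of how such a file might be built, not an official SymSpell tool. It assumes the input rows follow the tab-separated layout of the 2012 Google Books 2-gram export (`ngram`, `year`, `match_count`, `volume_count`) and that the output should use the same `word1 word2 count` layout as the English bigram file. The function names `aggregate_bigrams` and `format_bigram_dictionary`, the `min_count` parameter, and the sample rows are hypothetical, chosen for illustration only.

```python
from collections import Counter
from typing import Iterable


def aggregate_bigrams(lines: Iterable[str], min_count: int = 1) -> Counter:
    """Sum match counts over all years for each lowercased two-word n-gram.

    Assumes each row is: ngram <TAB> year <TAB> match_count <TAB> volume_count.
    """
    totals: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed rows
        ngram, match_count = fields[0], fields[2]
        words = ngram.split(" ")
        # Keep plain alphabetic word pairs only; this drops POS-tagged
        # tokens such as "chat_NOUN", punctuation, and numbers.
        if len(words) != 2 or not all(w.isalpha() for w in words):
            continue
        try:
            count = int(match_count)
        except ValueError:
            continue  # skip rows with a non-numeric count
        totals[" ".join(w.lower() for w in words)] += count
    # Discard rare bigrams below the (hypothetical) frequency threshold.
    return Counter({k: v for k, v in totals.items() if v >= min_count})


def format_bigram_dictionary(totals: Counter) -> str:
    """Render one 'word1 word2 count' line per bigram, most frequent first."""
    rows = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{bigram} {count}" for bigram, count in rows)


# Made-up sample rows for illustration; real data would be read from the
# downloaded French 2-gram files.
sample = [
    "il y\t1999\t120\t40",
    "il y\t2000\t80\t35",
    "Il y\t2000\t10\t5",
    "bonne journée\t2000\t7\t3",
    "chat_NOUN noir\t2000\t99\t9",
]

if __name__ == "__main__":
    print(format_bigram_dictionary(aggregate_bigrams(sample)))
```

Counts are summed across years and across case variants, so the three "il y" rows collapse into one entry. The resulting text can be saved to a file and loaded as the bigram dictionary, with the term in columns 0 and 1 and the count in column 2.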

MyProjectsPage commented 4 years ago

Got it! Thank you very much for your prompt reply :-)