en-wl / wordlist

SCOWL (and friends).
http://wordlist.aspell.net

SCOWLv2 make hunspell `--deaccent` not working #404

Closed: helloliuyiming closed this issue 2 months ago

helloliuyiming commented 2 months ago

Dear Kevina

Thank you for your outstanding work. However, I've encountered some issues while using SCOWL. Initially, I searched for a v2 release but was unable to find one. Consequently, I decided to generate a dictionary myself.

I followed these steps:

  1. Ran `make`
  2. Generated a word list using: `./scowl word-list scowl.db --spellings A,B,Z,C,D --variant-level '~' --size 80 --deaccent > all-english-words.txt`
  3. Changed directory: `cd speller`
  4. Generated the Hunspell dictionary: `make hunspell`

While I was able to generate the Hunspell dictionary, the results were not as expected. I anticipated a dictionary without accents, but the output still contained accented words. Some examples include:

"éclair/SM éclaircissement éclat élan émigré/SM épée/MS étagère/MS étude/S étui/SM"

I'm unsure whether this is due to incorrect usage on my part or if it's a bug in the system. I'm also uncertain about how to correct this issue.

Could you please provide some insight into this problem and advise on how to generate a Hunspell dictionary without accented characters?

Best regards, Liu
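
One way to narrow down a problem like this is to check whether the accents are already present in the generated word list or only appear in the final dictionary. The following is a minimal sketch, not part of SCOWL: `count_non_ascii` is a hypothetical helper name, and the sample file merely stands in for `all-english-words.txt`. It assumes UTF-8 input and a POSIX `grep`.

```shell
#!/bin/sh
# Count the lines in a file that contain any byte outside printable ASCII.
# count_non_ascii is a hypothetical helper name, not part of SCOWL.
count_non_ascii() {
  # LC_ALL=C makes grep compare raw bytes, so each byte of a UTF-8
  # accented letter falls outside the range from space to tilde.
  # grep -c exits non-zero when the count is 0; "|| true" hides that.
  LC_ALL=C grep -c '[^ -~]' "$1" || true
}

# Tiny stand-in for all-english-words.txt; \303\251 is UTF-8 for "é".
sample=$(mktemp)
printf 'eclair\n\303\251lan\netude\n' > "$sample"
count_non_ascii "$sample"   # prints 1
```

A count of 0 on the word list combined with accented entries in the dictionary would suggest that the dictionary was not built from that word list.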

kevina commented 2 months ago

Hi,

`make hunspell` will create the standard dictionaries, not a custom one.

To create a custom dictionary from a wordlist use this command:

cd speller
cat ../all-english-words.txt | ./make-hunspell-dict -one en-custom /dev/null

The results will be in `hunspell-en-custom.zip`

Kevin
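
Taken together, the commands quoted in this thread could be collected into a single script along the following lines. This is only a sketch: the `run` helper and the `DRY_RUN` variable are hypothetical additions for illustration, and the script assumes it is started from the top of a SCOWLv2 checkout. By default it only prints the commands; setting `DRY_RUN=0` would execute them.

```shell
#!/bin/sh
# Sketch: build a de-accented custom Hunspell dictionary using only the
# commands quoted in this thread. "run" and DRY_RUN are hypothetical helpers.
set -eu

# Default to printing the commands; DRY_RUN=0 executes them.
DRY_RUN=${DRY_RUN:-1}

run() {
  # Show the command line, then run it through sh unless in dry-run mode.
  printf '+ %s\n' "$1"
  [ "$DRY_RUN" = 1 ] || sh -c "$1"
}

build_custom_dict() {
  # Build scowl.db and the supporting tools.
  run "make"
  # Export the de-accented word list (flags as given in the question).
  run "./scowl word-list scowl.db --spellings A,B,Z,C,D --variant-level '~' --size 80 --deaccent > all-english-words.txt"
  # Feed that list to make-hunspell-dict instead of running "make hunspell".
  run "cd speller && cat ../all-english-words.txt | ./make-hunspell-dict -one en-custom /dev/null"
}

build_custom_dict
```

The point of the third step is that the word list is passed explicitly to `make-hunspell-dict`, whereas `make hunspell` ignores it and builds the standard dictionaries.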

helloliuyiming commented 2 months ago

It worked, thank you.