camarm-dev / remede

Open-source and free alternative to Antidote. A French dictionary.
https://remede.camarm.fr/

📘 [Wordlist] Add a Wiktionary crawler #167

Closed camarm-dev closed 2 months ago

camarm-dev commented 3 months ago

Crawl Wiktionary for all word pages and add the words and their phonetic transcriptions to the wordlist:

  1. Crawl a word page
  2. Check whether the word is already in the wordlist (if so, abort)
  3. Otherwise, add it to a temporary file, so scripts/add_word.py can perform a fast insertion into the Remède database
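
The three steps above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `queue_new_words`, `fetch_phonetic`, and the tab-separated pending file are hypothetical names and formats chosen for illustration, not the actual crawler or the input format that scripts/add_word.py expects. The real Wiktionary request is left out and replaced by a callable passed in by the caller.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Optional


def queue_new_words(
    titles: Iterable[str],
    known_words: set[str],
    fetch_phonetic: Callable[[str], Optional[str]],
    pending_path: Path,
) -> list[str]:
    """Append words missing from the wordlist to a pending file.

    Returns the words that were added, in the order they were seen.
    """
    added = []
    with pending_path.open("a", encoding="utf-8") as pending:
        for title in titles:
            word = title.strip()
            # Step 2: skip words already in the wordlist (or already queued).
            if not word or word in known_words:
                continue
            # Step 1: "crawl" the word. fetch_phonetic is a hypothetical
            # stand-in for the real crawler; None means nothing usable.
            phonetic = fetch_phonetic(word)
            if phonetic is None:
                continue
            # Step 3: one tab-separated line per word (assumed format).
            pending.write(f"{word}\t{phonetic}\n")
            known_words.add(word)
            added.append(word)
    return added


if __name__ == "__main__":
    # Fake "pages" so the sketch runs without any network access.
    fake_pages = {"chat": "/ʃa/", "chien": "/ʃjɛ̃/"}
    with tempfile.TemporaryDirectory() as tmp:
        pending_file = Path(tmp) / "pending_words.tsv"
        new_words = queue_new_words(
            ["chat", "chien"], {"chat"}, fake_pages.get, pending_file
        )
        print(new_words)
        print(pending_file.read_text(encoding="utf-8"), end="")
```

Note that the sketch checks the wordlist before fetching the page, so words that are already known cost no request; the order of steps 1 and 2 can be swapped back if the check needs data from the page itself.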

camarm-dev commented 3 months ago

This is not a good idea: many Wiktionary pages are not useful for a dictionary like Remède. It may be better to find another wordlist.

camarm-dev commented 3 months ago

Test with https://github.com/lorenbrichter/Words/blob/master/Words/fr.txt