xxyzz / WordDumb

A calibre plugin that generates Kindle Word Wise and X-Ray files for KFX, AZW3, MOBI and EPUB eBooks.
https://xxyzz.github.io/WordDumb/
GNU General Public License v3.0

Word wise not working correctly for French books #141

Closed e-zz closed 1 year ago

e-zz commented 1 year ago

Describe the bug

An error message pops up after changing Lemma lang → French in Customize Word Wise.

Possibly as a direct result of this (or possibly not), Word Wise generation does not work properly for French books. It does run, but the annotations it adds are all wrong, and when I click an explanation from WordDumb I see an English word instead of a French one. It's weird, and I don't know what I did wrong here. Maybe the language setting is simply wrong somehow, for example set to English after the plugin processes the book?

System Information

OS: Windows 10
calibre: 6.24.0
Python: 3.8
Plugin version: 3.29.5

Error message

calibre, version 6.24.0 (win32, embedded-python: True)
Tonnerre de Brest!: An error occurred, please copy error message then report bug at GitHub.

Starting job: Saving customized lemmas 
Job: "Saving customized lemmas" failed with error: 
Traceback (most recent call last):
  File "calibre\gui2\threaded_jobs.py", line 82, in start_work
  File "calibre_plugins.worddumb.config", line 357, in dump_lemmas_job
  File "calibre_plugins.worddumb.utils", line 56, in run_subprocess
  File "subprocess.py", line 524, in run
subprocess.CalledProcessError: Command '['py', 'C:\\Users\\ez\\AppData\\Roaming\\calibre\\plugins\\WordDumb.zip', '{"is_kindle": true, "db_path": "C:\\\\Users\\\\ez\\\\AppData\\\\Roaming\\\\calibre\\\\plugins\\\\worddumb-lemmas\\\\fr\\\\wiktionary_fr_en_v0.db", "lemma_lang": "fr", "plugin_path": "C:\\\\Users\\\\ez\\\\AppData\\\\Roaming\\\\calibre\\\\plugins\\\\WordDumb.zip", "model_name": "fr_core_news_md"}', '{"use_pos": true, "search_people": true, "model_size": "md", "zh_wiki_variant": "cn", "fandom": "", "add_locator_map": false, "preferred_formats": ["KFX", "AZW3", "AZW", "MOBI", "EPUB"], "use_all_formats": false, "minimal_x_ray_count": 1, "en_ipa": "ga_ipa", "zh_ipa": "pinyin", "choose_format_manually": true, "wiktionary_gloss_lang": "en", "kindle_gloss_lang": "en", "use_gpu": false, "cuda": "cu118", "last_opened_kindle_lemmas_language": "fr", "last_opened_wiktionary_lemmas_language": "fr", "use_wiktionary_for_kindle": false, "ca_wiktionary_difficulty_limit": 5, "da_wiktionary_difficulty_limit": 5, "de_wiktionary_difficulty_limit": 5, "el_wiktionary_difficulty_limit": 5, "en_wiktionary_difficulty_limit": 5, "es_wiktionary_difficulty_limit": 5, "fi_wiktionary_difficulty_limit": 5, "fr_wiktionary_difficulty_limit": 5, "hr_wiktionary_difficulty_limit": 5, "it_wiktionary_difficulty_limit": 5, "ja_wiktionary_difficulty_limit": 5, "ko_wiktionary_difficulty_limit": 5, "lt_wiktionary_difficulty_limit": 5, "mk_wiktionary_difficulty_limit": 5, "nl_wiktionary_difficulty_limit": 5, "no_wiktionary_difficulty_limit": 5, "pl_wiktionary_difficulty_limit": 5, "pt_wiktionary_difficulty_limit": 5, "ro_wiktionary_difficulty_limit": 5, "ru_wiktionary_difficulty_limit": 5, "sl_wiktionary_difficulty_limit": 5, "sv_wiktionary_difficulty_limit": 5, "uk_wiktionary_difficulty_limit": 5, "zh_wiktionary_difficulty_limit": 5}']' returned non-zero exit status 1.

Called with args: (True, WindowsPath('C:/Users/ez/AppData/Roaming/calibre/plugins/worddumb-lemmas/fr/wiktionary_fr_en_v0.db'), 'fr') {'notifications': <queue.Queue object at 0x0000015AF4C24F40>, 'abort': <threading.Event object at 0x0000015AF4C25120>, 'log': <calibre.utils.logging.GUILog object at 0x0000015AF4C251E0>} 
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\ez\AppData\Roaming\calibre\plugins\WordDumb.zip\__main__.py", line 24, in <module>
  File "C:\Users\ez\AppData\Roaming\calibre\plugins\WordDumb.zip\dump_lemmas.py", line 69, in dump_spacy_docs
KeyError: 'spacy_model'
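
One plausible reading of the traceback, offered as a hedged sketch and not as the plugin's actual code: the JSON arguments in the failed command carry a `model_name` key, while the exception names `spacy_model`. The snippet below reproduces that kind of key mismatch. The helper `load_model_name` is hypothetical; only the two key names and the model value come from the log.

```python
import json

# Trimmed copy of the first JSON argument from the failed command above.
args_json = '{"is_kindle": true, "lemma_lang": "fr", "model_name": "fr_core_news_md"}'
args = json.loads(args_json)


def load_model_name(options: dict) -> str:
    """Hypothetical helper that reads the key named in the KeyError."""
    return options["spacy_model"]


try:
    load_model_name(args)
    error = None
except KeyError as exc:
    # exc.args[0] is the missing key
    error = exc.args[0]

print(error)
```

If this reading is right, the caller and the subprocess disagree on the name of the key, and no setting on the user's side would avoid the error.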

Reproduce steps

See the picture below to reproduce it.

Screenshots or videos

image

e-zz commented 1 year ago

In short, the Lemma language seems to be fixed as English in my case.

xxyzz commented 1 year ago

Thanks for the report! The commit linked above should fix this error.

If the Use Wiktionary definition option is enabled, you have to set the Word Wise definition language to Chinese on your Kindle: https://xxyzz.github.io/WordDumb/usage.html#create-files

e-zz commented 1 year ago

Hi, thanks for the quick fix. I tested your solution and now word wise works smoothly.

gloverd commented 1 year ago

I'm trying to run Word Wise on a French book using the newest artifact linked above (3.29.6), but I can't generate correct glosses, and it looks like it isn't looking up the correct lemmas either. Since this is related to this ticket (Word Wise in French), which is still open, I'm listing it here, but I can open another ticket if needed since I'm posting a lot.

SUMMARY: WordDumb appears to look up only words that match an English lemma, even when the lemma language is set to French. As a result, Word Wise appears only for words shared between English and French (like "transparent"), and the gloss shown is usually either the same word repeated or a different definition.
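
To make the suspected behavior concrete, here is a toy sketch in Python. It is not the plugin's lookup code, and both lemma tables are made-up examples. It only shows that matching French text against an English lemma table annotates just the words the two languages share.

```python
# Toy lemma tables; the entries are illustrative, not taken from the plugin's databases.
english_lemmas = {"salon", "transparent", "ton", "cradle", "armchair"}
french_lemmas = {
    "salon", "transparent", "ton",
    "consterner", "berceau", "incessamment", "fauteuil",
}

# Words taken from the two pages discussed in this comment.
book_words = [
    "salon", "transparent", "ton",
    "consterner", "berceau", "incessamment", "fauteuil",
]


def annotated(words, lemma_table):
    """Return the words that would receive a Word Wise annotation."""
    return [word for word in words if word in lemma_table]


with_english_table = annotated(book_words, english_lemmas)
with_french_table = annotated(book_words, french_lemmas)

print(with_english_table)
print(with_french_table)
```

With the English table only the three shared words are annotated, which matches what I see on the device.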

The result of running "Create Word Wise" on a book with the following settings:

The resulting pages look like this: image image

What it has as Lemma/Gloss are

  1. Salon / "Salon"
  2. Transparent / "Transparent"
  3. Ton / "Unite de mesure de poids . Le symbole : t"

Looking at the "Customize Kindle Word Wise" screen that appears when first setting Lemma/Gloss there are different Gloss results for these words: image image

I also picked some more difficult words at random from those two pages that I expected to see defined, and they are present in the customize page. So these are lemmas that are missing from the book (Consterner, Berceau, Incessamment, Fauteuil): image image

Is this because the length of the definition is greater than 3?

I have deleted the English and French Kindle Word Wise lemma database files (%AppData%/calibre/plugins/worddumb-lemmas/) and done a clean install of calibre and the plugin during my testing.
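
For anyone repeating this cleanup, a minimal sketch of how the lemma databases could be listed and removed with Python. The function names are my own, and the assumption that every lemma database ends in `.db` is based only on the `wiktionary_fr_en_v0.db` file name in the traceback above. It defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path
from typing import List


def find_lemma_dbs(lemmas_dir: Path) -> List[Path]:
    """Return every .db file below lemmas_dir, or an empty list if the folder is missing."""
    if not lemmas_dir.is_dir():
        return []
    return sorted(lemmas_dir.rglob("*.db"))


def remove_lemma_dbs(lemmas_dir: Path, dry_run: bool = True) -> List[Path]:
    """List the lemma databases and delete them only when dry_run is False."""
    found = find_lemma_dbs(lemmas_dir)
    if not dry_run:
        for db_file in found:
            db_file.unlink()
    return found
```

On Windows the folder from the traceback would be `Path(os.environ["APPDATA"]) / "calibre" / "plugins" / "worddumb-lemmas"`. Close calibre before deleting anything.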

In #136, specifically this comment you mention that

The Word Wise database file on Kindle will be overwritten. If you enable "use Wiktionary definition", new database file will be copied to Kindle if the previous database is in a different language. The Kindle's default db have to be restored manually.

I'm not sure whether all this testing has corrupted something, but I want to get as close to a clean slate as possible.

Questions:

  1. If the length of the definition exceeds the 3:1 ratio, is it supposed to fall back to a pop-up, footnote-style behavior, or just not appear?
  2. Do you know how Phobooky was able to see the Word Wise results in the calibre E-book viewer? There are screenshots in #53, and that would make this testing so much easier!
  3. How do I restore the Kindle's default database manually? What is the best way to get back to a fresh installation on calibre and Kindle if needed?
  4. Sometimes the "Saving Customized Lemmas" or "Generating Word Wise" step just hangs. I've left it running for several hours, but I end up having to close calibre and force-stop the Python process running in the background. Does stopping calibre/Python corrupt anything? I wonder whether a freeze in "Saving Customized Lemmas" creates a corrupted file that then gets reused later on. That is why I'm looking for the best way to do a clean reinstall.
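
Regarding question 4, one way to test a suspect file myself, assuming the `.db` files are SQLite databases (I have not confirmed this), is SQLite's built-in `PRAGMA integrity_check`. The helper name below is my own. A passing check does not prove the contents are complete, only that the file structure is readable.

```python
import sqlite3
from pathlib import Path


def sqlite_file_is_intact(db_path: Path) -> bool:
    """Return True if db_path exists and passes SQLite's integrity check."""
    if not db_path.is_file():
        # sqlite3.connect would silently create an empty database here
        return False
    try:
        conn = sqlite3.connect(str(db_path))
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a SQLite database at all
        return False
    return rows == [("ok",)]
```

A file that fails this check would be a good candidate to delete before regenerating the lemmas.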