ilius / pyglossary

A tool for converting dictionary files aka glossaries. Mainly to help use our offline glossaries in any Open Source dictionary we like on any modern operating system / device.
GNU General Public License v3.0

Zim to Stardict: FileNotFoundError #585

Closed Steven630 closed 1 month ago

Steven630 commented 2 months ago

I have updated to 4.7.1 and got the following error:

[INFO] Automatically switching to SQLite mode for writing Stardict
[INFO] Using sortKeyName = 'stardict'
[INFO] Removing and re-creating 'C:\Users\64087\AppData\Local\PyGlossary\Cache\wiktionary_en_simple_all_nopic_2024-06.zim.db'
[WARNING] Unsupported operating system (no os.statvfs)
[WARNING] Unrecognized mimetype='undefined'
[WARNING] Unrecognized mimetype='undefined'
[WARNING] Unrecognized mimetype='undefined'
[ERROR] resource title: mw/skins.minerva.base.reset|skins.minerva.content.styles|ext.cite.style|site.styles|mobile.app.pagestyles.android|mediawiki.page.gallery.styles|mediawiki.skinning.content.parsoid.css
[ERROR] Exception in Tkinter callback:
Traceback (most recent call last):
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\ui\ui_tk.py", line 197, in CallWrapper__call__
    return self.func(*args)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\ui\ui_tk.py", line 1489, in convert
    finalOutputFile = self.glos.convert(
                      ^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 1274, in convert
    return self.convertV2(args)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 1216, in convertV2
    sort = self._convertPrepare(
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 1170, in _convertPrepare
    if not self._read(
           ^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 767, in _read
    self.loadReader(reader)
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 785, in loadReader
    for entry in self._applyEntryFiltersGen(reader):
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 439, in _applyEntryFiltersGen
    for entry in gen:
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\plugins\zimfile.py", line 198, in __iter__
    yield glos.newDataEntry(word, b_content)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 613, in newDataEntry
    return DataEntry(
           ^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\entry.py", line 55, in __init__
    with open(tmpPath, "wb") as toFile:
         ^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\64087\AppData\Local\PyGlossary\Cache\wiktionary_en_simple_all_nopic_2024-06.zim_res\mw_skins.minerva.base.reset|skins.minerva.content.styles|ext.cite.style|site.styles|mobile.app.pagestyles.android|mediawiki.page.gallery.styles|mediawiki.skinning.content.parsoid.css'

ilius commented 2 months ago

I pushed a fix. Please try again.
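The thread does not say what the fix changed or what the root cause was. Two properties of the failing path are at least suspicious on Windows: the file name contains `|`, which Windows does not allow in file names, and the full path appears to be longer than the classic 260-character limit. The sketch below checks a path for both conditions. `windows_path_problems` is a hypothetical helper written for illustration, not part of pyglossary.

```python
import ntpath

# Characters Windows rejects in a file or directory name.
WINDOWS_RESERVED = set('<>:"/\\|?*')
# Classic Windows path limit (MAX_PATH), which includes the terminating NUL.
MAX_PATH = 260


def windows_path_problems(path: str) -> list:
    """Return reasons `path` may fail to open on Windows (hypothetical helper)."""
    problems = []
    # ntpath understands backslash separators on any platform.
    name = ntpath.basename(path)
    bad = sorted(WINDOWS_RESERVED.intersection(name))
    if bad:
        problems.append(f"reserved characters in file name: {''.join(bad)}")
    if len(path) >= MAX_PATH:
        problems.append(f"path is {len(path)} characters, limit is {MAX_PATH - 1}")
    return problems


# The path from the FileNotFoundError message above.
failing = (
    r"C:\Users\64087\AppData\Local\PyGlossary\Cache"
    r"\wiktionary_en_simple_all_nopic_2024-06.zim_res"
    r"\mw_skins.minerva.base.reset|skins.minerva.content.styles"
    r"|ext.cite.style|site.styles|mobile.app.pagestyles.android"
    r"|mediawiki.page.gallery.styles|mediawiki.skinning.content.parsoid.css"
)
for problem in windows_path_problems(failing):
    print(problem)
```

A writer that has to cache arbitrary resource names on disk could run a check like this and shorten or rename the offending entries before calling `open`.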

Steven630 commented 2 months ago

Sorry, I still can't upgrade. Yesterday I managed to upgrade only because you released a new version. Downloading the zip file and double-clicking "main.py" does not have any effect.

ilius commented 2 months ago

Try "Open with...", then choose python.exe from the Python installation directory.

ilius commented 2 months ago

If you need help, email me.

Steven630 commented 2 months ago

> If you need help, email me.

Thank you. I still could not update to the latest version. I don't have a desktop at hand at the moment and will try again later.

ilius commented 1 month ago

I published this tag: https://pypi.org/project/pyglossary/4.8.0rc0/

Install with pip install pyglossary==4.8.0rc0

You will need to add the --pre flag if you do not specify the version.
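The two install variants can be sketched in Python as follows. `pip_install_command` is a hypothetical helper that only builds the command line; the `--upgrade` flag is an addition of this sketch, on the assumption that an older pyglossary is already installed.

```python
import sys
from typing import Optional


def pip_install_command(version: Optional[str] = None) -> list:
    """Build a pip command line for pyglossary (hypothetical helper)."""
    cmd = [sys.executable, "-m", "pip", "install", "--upgrade"]
    if version is None:
        # Without a pinned version, pip skips pre-releases unless --pre is given.
        cmd += ["--pre", "pyglossary"]
    else:
        # An exact pin such as 4.8.0rc0 selects the pre-release directly.
        cmd.append(f"pyglossary=={version}")
    return cmd


print(" ".join(pip_install_command("4.8.0rc0")))
print(" ".join(pip_install_command()))
# To actually install, pass the list to subprocess.run(cmd, check=True).
```

Running pip through `sys.executable -m pip` makes sure the package lands in the same Python installation that will later run PyGlossary.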

Steven630 commented 1 month ago

Thank you. With the latest version, the following errors occur:

https://gist.github.com/ilius/6265667667acd855d233ce3b1eaaa705

ilius commented 1 month ago

Please add these flags to your command and try again: -v0 --read-options=text_unicode_errors=ignore
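The name `text_unicode_errors` suggests that the option selects the error handler used when decoding text, although the thread does not spell this out. Assuming it maps onto Python's own `errors` argument, this small sketch shows the difference between strict decoding and `ignore` on a made-up byte string:

```python
raw = b"caf\xc3\xa9 \xff ok"  # valid UTF-8 except for the stray 0xFF byte

try:
    raw.decode("utf-8")  # the default is errors="strict"
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# errors="ignore" silently drops bytes that cannot be decoded.
cleaned = raw.decode("utf-8", errors="ignore")
print(repr(cleaned))
```

The trade-off is that `ignore` lets the conversion finish at the cost of losing the undecodable bytes without any report.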

Steven630 commented 1 month ago

I use the Tkinter interface. How do I add these flags?

ilius commented 1 month ago

Click on the first Options button after selecting the zim file name. Then click on the Value cell for text_unicode_errors and select ignore. Then click OK and Convert.

Steven630 commented 1 month ago

Thank you. The conversion was successful.

Steven630 commented 1 month ago

I just converted another, bigger file (800 MB) and got the following error (I suspect the generated file is not complete):

[WARNING] Unrecognized mimetype='image/svg+xml; charset=utf-8; profile="https://www.mediawiki.org/wiki/Specs/SVG/1.0.0"'
[INFO] ZIM Entry Count: 356562
[ERROR] Files with name too long: 0
[INFO] Empty Content Count: 7
[INFO] Redirect Count: 98746
[INFO] Writing to Stardict file 'C:\Users\64087\Downloads\wikipedia_simple.ifo'
[INFO] Sorting took 1.0 seconds
[INFO] Auto-selecting sametypesequence=h
[ERROR] StarDict: dictMark = 4294971195 is too big, set option large_file=true
[INFO] Sorting 306001 items...
[INFO] Sorting 306001 C:\Users\64087\Downloads\wikipedia_simple.idx took 0.09 seconds
[INFO] Writing 306001 index entries...
[INFO] Writing 306001 C:\Users\64087\Downloads\wikipedia_simple.idx took 0.09 seconds
[INFO] Writing dict file took 28.61 seconds
[ERROR] Exception in Tkinter callback:
Traceback (most recent call last):
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\ui\ui_tk.py", line 198, in CallWrapper__call__
    return self.func(*args)
           ^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\ui\ui_tk.py", line 1490, in convert
    finalOutputFile = self.glos.convert(
                      ^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 1234, in convert
    return self.convertV2(args)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 1198, in convertV2
    finalOutputFile = self._write(
                      ^^^^^^^^^^^^
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 933, in _write
    self._writeEntries(writerList, filename)
  File "C:\Users\64087\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyglossary\glossary_v2.py", line 876, in _writeEntries
    gen.send(entry)
StopIteration

ilius commented 1 month ago

Click on the second Options button (to the right of Output format), click on large_file to change it to true, and try again.
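The `dictMark` value in the log is just past what fits in an unsigned 32-bit integer. The StarDict format stores each entry's offset into the .dict file as a big-endian unsigned integer, 32 bits wide by default and 64 bits wide when the .ifo file declares `idxoffsetbits=64`. Assuming `large_file` switches the writer to the 64-bit form, the arithmetic looks like this (`pack_offset` is an illustrative helper, not pyglossary code):

```python
import struct

dict_mark = 4294971195  # offset reported in the log above
UINT32_MAX = 2**32 - 1


def pack_offset(offset: int, large_file: bool) -> bytes:
    """Pack an index offset as big-endian 64-bit or 32-bit (illustrative)."""
    fmt = ">Q" if large_file else ">I"
    return struct.pack(fmt, offset)


# How far the offset overshoots the 32-bit range.
print(dict_mark - UINT32_MAX)

try:
    pack_offset(dict_mark, large_file=False)
except struct.error as exc:
    print("32-bit offset failed:", exc)

# With 64-bit offsets the same value packs into 8 bytes.
print(len(pack_offset(dict_mark, large_file=True)))
```

Readers that only understand 32-bit offsets cannot open an index written this way, which is the cost of the 64-bit layout.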

Steven630 commented 1 month ago

It worked! However, look-ups in KOReader did not return any results, even though the converted dictionary was recognized.

ilius commented 1 month ago

For KOReader / sdcv you have to set merge_syns=True in StarDict Options.

Steven630 commented 1 month ago

> For KOReader / sdcv you have to set merge_syns=True in StarDict Options.

Yes, I have already done that. But this one does not work (unlike the smaller wiktionary conversion). Seems that the file does not have synonyms to begin with.

ilius commented 1 month ago

Oh, maybe sdcv does not support these large files (64-bit index). In that case, the only option is to split it up into several dicts.
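One way to approach the splitting, sketched under the assumption that entries are available as (headword, definition bytes) pairs: group them into chunks whose combined definition size stays below the 32-bit offset limit, then write each chunk as its own dictionary. `split_entries` is a hypothetical helper and does not use pyglossary's API.

```python
from typing import Iterable, Iterator, List, Tuple

Entry = Tuple[str, bytes]  # (headword, definition bytes); illustrative shape
UINT32_LIMIT = 2**32  # offsets must stay below this for a 32-bit index


def split_entries(
    entries: Iterable[Entry],
    max_bytes: int = UINT32_LIMIT - 1,
) -> Iterator[List[Entry]]:
    """Yield consecutive chunks whose definitions total at most max_bytes."""
    chunk: List[Entry] = []
    size = 0
    for word, definition in entries:
        # Start a new chunk when adding this entry would exceed the budget.
        if chunk and size + len(definition) > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.append((word, definition))
        size += len(definition)
    if chunk:
        yield chunk


# Tiny demonstration with a 10-byte budget instead of 4 GiB.
demo = [("a", b"x" * 6), ("b", b"y" * 5), ("c", b"z" * 4)]
parts = list(split_entries(demo, max_bytes=10))
print([[word for word, _ in part] for part in parts])
```

Because the input order is preserved, splitting an already sorted glossary yields parts that each cover a contiguous range of headwords.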

ilius commented 1 month ago

This issue is getting long. Please open a new issue for this if you like.