ilius / pyglossary

A tool for converting dictionary files aka glossaries. Mainly to help use our offline glossaries in any Open Source dictionary we like on any modern operating system / device.
GNU General Public License v3.0

Error converting Stardict to Kobo #506

Closed ksignorini closed 1 year ago

ksignorini commented 1 year ago

When trying to convert a Stardict dictionary to Kobo using the cmd interface, I get the following error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pyglossary/glossary_v2.py", line 570, in _openReader
    openResult = reader.open(filename)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pyglossary/plugins/stardict.py", line 317, in open
    self.readIfoFile()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pyglossary/plugins/stardict.py", line 354, in readIfoFile
    for line in ifoFile:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 197: invalid start byte
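For context, byte 0x92 cannot start a UTF-8 sequence, but in Windows-1252 it encodes the right single quotation mark (’). The sketch below reproduces the same class of error, assuming the .ifo file was saved as Windows-1252; the thread does not confirm the actual encoding, and the sample bytes and the `try_decode` helper are hypothetical.

```python
# Hypothetical sample bytes; NOT the actual contents of Elderlings_v6.ifo.
raw = b"description=Robin Hobb\x92s Realm of the Elderlings\n"


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc}"


# Decoding as UTF-8 fails on the 0x92 byte.
utf8_result = try_decode(raw, "utf-8")
# Decoding as Windows-1252 (an assumed encoding) turns 0x92 into U+2019.
cp1252_result = try_decode(raw, "cp1252")

print(utf8_result)
print(cp1252_result)
```

If the second line prints readable text with a curly apostrophe, the file is probably Windows-1252 rather than UTF-8.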

ilius commented 1 year ago

Can you upload your .ifo file?

ksignorini commented 1 year ago

Here it is.

Elderlings_v6.ifo.txt

ksignorini commented 1 year ago

This is happening with a number of other dictionaries as well, even after I convert the .ifo and .dict files to UTF-8.

ilius commented 1 year ago

The ifo file is not UTF-8. Where did you download them from? Do they work in GoldenDict or StarDict?

BTW, modifying or re-encoding the .dict file will most likely break the glossary, because it would change the byte positions that are stored in the .idx file.
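A toy model illustrates why re-encoding the data file breaks lookups. It is a simplified sketch and not pyglossary's or StarDict's actual reader: definitions are concatenated into one blob, and a separate index stores each entry's byte offset and size. The entries, the `build` and `lookup` helpers, and the Windows-1252 starting encoding are all hypothetical.

```python
# Toy model only: definitions are concatenated into one blob, and the index
# records (byte offset, byte size) for each headword.
entries = {"fool": "the King\u2019s jester", "wit": "beast magic"}


def build(items: dict[str, str], encoding: str) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate encoded definitions and record where each one starts."""
    blob = b""
    index: dict[str, tuple[int, int]] = {}
    for word, definition in items.items():
        data = definition.encode(encoding)
        index[word] = (len(blob), len(data))
        blob += data
    return blob, index


def lookup(blob: bytes, index: dict[str, tuple[int, int]], word: str) -> bytes:
    """Slice the definition out of the blob using the stored offset and size."""
    offset, size = index[word]
    return blob[offset:offset + size]


blob_cp1252, index = build(entries, "cp1252")

# Re-encode only the data blob, leaving the old index untouched.
blob_utf8 = blob_cp1252.decode("cp1252").encode("utf-8")

original = lookup(blob_cp1252, index, "wit")
broken = lookup(blob_utf8, index, "wit")

print(original)  # the intact definition
print(broken)    # a slice that starts inside the previous entry
```

The curly apostrophe takes one byte in Windows-1252 but three in UTF-8, so every entry after it shifts by two bytes while the index still points at the old positions.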

ksignorini commented 1 year ago

Fictionary.net

Using the read encoding option doesn’t always seem to work either (as in this post: https://github.com/ilius/pyglossary/issues/309).

Here is the full dictionary file...

Elderlings_Fictionary_6.zip

ksignorini commented 1 year ago

I also just checked in GoldenDict and it works perfectly in GoldenDict.

ilius commented 1 year ago

Please try again with --read-options unicode_errors=replace
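As a rough illustration of what a `replace` error handler does, here is a minimal sketch using Python's built-in decoder. It assumes the `unicode_errors` option is passed through to Python's standard `errors` argument, which the thread does not confirm; the sample bytes are hypothetical.

```python
# Hypothetical sample bytes containing a Windows-1252 apostrophe (0x92).
raw = b"Fool\x92s Errand"

# With errors="replace", each undecodable byte becomes U+FFFD (the Unicode
# replacement character) instead of raising UnicodeDecodeError.
decoded = raw.decode("utf-8", errors="replace")

print(decoded)
print(decoded.count("\ufffd"))
```

The conversion no longer fails, but the original character is lost: the apostrophe is replaced by U+FFFD in the decoded text.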

ksignorini commented 1 year ago

Generating the Kobo format dictionary now works without erroring out. However, after zipping the folder and installing on the Kobo, the dictionary doesn't work. None of the words I've tried are found in the dictionary.

ksignorini commented 1 year ago

This is the command I'm using to launch:

~/pyglossary-master/main.py --source-lang English --target-lang English --read-options unicode_errors=replace --cmd

And on subsequent runs, as instructed by pyglossary:

pyglossary Elderlings_v6.ifo dicthtml-Fictionary-Elderlings --read-format=Stardict --write-format=Kobo --json-read-options '{"unicode_errors": "replace"}' --source-lang=English --target-lang=English

Are these correct?

(I'm running on macOS currently.)

ilius commented 1 year ago

You can create a new issue, although I don't have a Kobo, so I can't help you myself without more information.