ilius / pyglossary

A tool for converting dictionary files aka glossaries. Mainly to help use our offline glossaries in any Open Source dictionary we like on any modern operating system / device.
GNU General Public License v3.0

Data file is corrupted. Word "..." #22

Closed (ratijas closed 8 years ago)

ratijas commented 8 years ago

for every article in a stardict *.ifo file i'm getting: Data file is corrupted. Word "..."

python2 ~/projects/pyglossary/pyglossary.pyw --ui=cmd PhraseBookRuEs.ifo txt

dictionary taken from here: http://rutracker.org/forum/viewtopic.php?p=69468986

the .idx.gz file was gunzipped; the .dict.dz file was renamed to *.dict.gz and gunzipped. note that dictionaries unpacked this way work well in "dictionary universal" (iOS app).
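
For reference, the unpacking step described above can be sketched in Python. This is only an illustration, not part of pyglossary: the helper name `gunzip_to` and the file names in the usage comment are hypothetical. It relies on dictzip (`.dict.dz`) files being gzip-compatible, which is why renaming to `.gz` and gunzipping works at all.

```python
import gzip
import shutil


def gunzip_to(src_path, dst_path):
    """Decompress a gzip-compatible file (e.g. .idx.gz or .dict.dz) to dst_path.

    Hypothetical helper for illustration; not a pyglossary function.
    """
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Stream in chunks so large dictionaries are not loaded into memory.
        shutil.copyfileobj(src, dst)


# Usage (file names are assumptions based on the dictionary named above):
# gunzip_to("PhraseBookRuEs.idx.gz", "PhraseBookRuEs.idx")
# gunzip_to("PhraseBookRuEs.dict.dz", "PhraseBookRuEs.dict")
```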

tuxor1337 commented 8 years ago

Okay, and if you don't unzip the files, then it's working? So why are you unzipping these files?

tuxor1337 commented 8 years ago

Now I downloaded the file from your link and I'm not able to reproduce this error message. Everything works fine regardless of whether I gunzip first or not.

Oh, almost forgot to mention this: I changed the line sametypesequence=x in the *.ifo file to sametypesequence=h. Maybe we should support xdxf in the future, since it's basically a harmless xml format. That would just mean adding an x here: https://github.com/ilius/pyglossary/blob/master/pyglossary/plugins/stardict.py#L244
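
To check which value a given dictionary declares before editing it by hand, the `.ifo` file can be read as plain `key=value` lines. The following is a minimal sketch under that assumption; `read_ifo` and `get_sametypesequence` are hypothetical names and this is not how pyglossary's StarDict plugin is implemented.

```python
def read_ifo(path):
    """Parse the key=value lines of a StarDict .ifo file into a dict.

    Hypothetical helper for illustration; not pyglossary's own parser.
    """
    info = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if "=" not in line:
                continue  # skip the header line and any blank lines
            key, _, value = line.partition("=")
            info[key.strip()] = value.strip()
    return info


def get_sametypesequence(path):
    """Return the declared sametypesequence (e.g. "x" or "h"), or None."""
    return read_ifo(path).get("sametypesequence")


# Usage (file name is an assumption based on the dictionary named above):
# print(get_sametypesequence("PhraseBookRuEs.ifo"))
```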

ratijas commented 8 years ago

@tuxor1337 well, uhmm… i'm only experienced in dsl and apple .dictionary formats. so what you said sounds to me like “idk your problem, it rocks for me! although i hacked source code in one byte haha, but you'll never get it!”

PS it finally works, after digging into the implementation and figuring out the origins of four or so non-obvious errors. for example, this great one appears if i don't specify an output file name, but use --write-format=AppleDict:

File "/Volumes/DataHD/users/ivan/projects/py/pyglossary/ui/ui_cmd.py", line 189, in run
    opath = os.path.splitext(ipath)[0] + ext
TypeError: cannot concatenate 'str' and 'tuple' objects
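
The traceback shows `ext` arriving as a tuple where a string was expected. Why it is a tuple is not stated in this thread; one plausible reading is that the output format declares several extensions. Under that assumption, a defensive version of the failing line could look like the sketch below. The function name `output_path` and the example extensions are hypothetical, and this is not the fix that was actually committed.

```python
import os


def output_path(ipath, ext):
    """Build an output path from the input path and an extension.

    Assumption: `ext` is either a single string or a tuple/list of
    candidate extensions; in the latter case the first one is used.
    """
    if isinstance(ext, (tuple, list)):
        if not ext:
            raise ValueError("no extension available for the output format")
        ext = ext[0]
    return os.path.splitext(ipath)[0] + ext


# Usage (extension values are made up for illustration):
# output_path("PhraseBookRuEs.ifo", (".xml", ".plist"))
```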

tuxor1337 commented 8 years ago

The error message that you mention in your first post is definitely not related to the value of sametypesequence. I'm not able to reproduce your first error message even with sametypesequence=x.

So, does your pull request #23 solve this issue for you?

ilius commented 8 years ago

Is this resolved?

ratijas commented 8 years ago

after 1617db05362f98afca4528310a6196275cded604 i consider it resolved. thanks.

although, there is a lot of work to do on the dsl and apple dict modules. the former needs the ability to handle badly broken data, the latter needs to be improved. after make failed, i tested the .xml with the jing relax ng validator, and i got a 208 MB (!) error report for an 18 MB apple dict .xml. i'm gonna do a few pull requests soon.

ilius commented 8 years ago

Thanks. I'd appreciate it if you separated them into micro-commits and opened a new pull request for each one.

ratijas commented 8 years ago

i did one commit and a pull request. right after that i did two more commits, but they appeared in the same pull request automagically. btw, level of my git-skill: “if something goes wrong, rm -rf and git checkout.” i really gotta read the “progit” book.

but i'll do whatever is needed with these commits if you tell me how.