ilius / pyglossary

A tool for converting dictionary files aka glossaries. Mainly to help use our offline glossaries in any Open Source dictionary we like on any modern operating system / device.
GNU General Public License v3.0
2.26k stars, 237 forks

Support reading Warodai (UTF-16 text, Japanese-Russian) #569

Open GrimPixel opened 5 months ago

GrimPixel commented 5 months ago

Would you consider supporting the txt format of Warodai?

banditto9 commented 1 month ago

Hello, and many thanks for your wonderful tool! Let me join the topic starter: the Warodai dictionary in text format is the most frequently updated version and a universal one. Personally, I'm looking forward to this support, as my target is conversion to EPUB and MOBI for use on a Kindle e-reader. Thank you.

banditto9 commented 1 month ago

Let me add a small note here. The Warodai project also publishes a link to a contributor who produces an EDICT-format file from the text: https://github.com/update692/warodai-to-edict As a result we get "output.txt", which looks close to a CSV file but with a "/" delimiter. I think this makes the conversion easier. Support for the old but still existing EDICT (not EDICT2) format could be added to the PyGlossary project at the same time. Thank you for your time and attention.
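To illustrate why a "/"-delimited file is easy to convert, here is a minimal sketch of a line parser. It assumes that "output.txt" follows the classic EDICT layout, `HEADWORD [READING] /gloss 1/gloss 2/`, with the bracketed reading optional; I have not checked this against the actual output of warodai-to-edict. The name `parse_edict_line` is hypothetical and is not part of PyGlossary.

```python
import re

# Assumed classic EDICT layout: "HEADWORD [READING] /gloss 1/gloss 2/"
# The "[READING]" part is optional.
EDICT_LINE = re.compile(
    r"^(?P<headword>\S+)"             # headword: first run of non-space characters
    r"(?:\s+\[(?P<reading>[^\]]*)\])?"  # optional reading in square brackets
    r"\s+/(?P<glosses>.*)/\s*$"       # everything between the first and last slash
)


def parse_edict_line(line):
    """Return (headword, reading, glosses), or None if the line does not match."""
    match = EDICT_LINE.match(line)
    if match is None:
        return None
    # Split the gloss field on "/" and drop empty pieces.
    glosses = [g.strip() for g in match.group("glosses").split("/") if g.strip()]
    return match.group("headword"), match.group("reading"), glosses
```

For example, `parse_edict_line("猫 [ねこ] /кошка/кот/")` yields the headword, the reading, and a list of two glosses. One caveat of this sketch: a "/" that appears inside a gloss would be treated as a delimiter.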

soshial commented 1 month ago

Just use the StarDict version and convert it to the desired format, @banditto9

banditto9 commented 1 month ago

@soshial Hello, and thanks for the tip. I know it exists, but that version is quite old; the txt is updated much more frequently (months vs. years), since third-party conversions are rarely performed.

soshial commented 1 month ago

No one is going to develop for a single one-off case, especially for free. It's simply not realistic.

banditto9 commented 1 month ago

@soshial I see. I'd be happy to make a private fork, but unfortunately I'm not a developer. In the meantime I was able to do a conversion of sorts: https://github.com/update692/warodai-to-edict puts each entry on one line, and I then split it into CSV columns in Excel. I hope this is somehow helpful to the topic starter. It worked, but the lack of formatting makes it not very usable, as everything ends up on one line with no \n or \r\n.

What could be added, so that this is not a "one-off case", is a "\r\n separator" text option, which is missing now: a "vertical" alternative to the existing "horizontal" CSV text option. I understand this is not yet a common standard for plain-text dictionaries, but it may exist outside Warodai.
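As a rough sketch of the "vertical" idea, the snippet below turns one entry into a tab-separated line whose definition holds one gloss per line, with the line breaks stored as the two-character escape `\n`. My understanding is that tab-separated dictionary files are commonly written this way, but treat the escaping convention as an assumption and check the documentation of the target format. The names `escape_field` and `to_vertical_entry` are hypothetical.

```python
def escape_field(text):
    """Escape backslashes, tabs and line breaks so a field fits on one line."""
    # Backslashes go first so the escapes added below are not doubled.
    text = text.replace("\\", "\\\\")
    text = text.replace("\t", "\\t")
    # Normalize Windows line endings before escaping plain newlines.
    text = text.replace("\r\n", "\n")
    return text.replace("\n", "\\n")


def to_vertical_entry(headword, glosses, reading=None):
    """Build 'headword<TAB>definition' with one numbered gloss per line."""
    lines = []
    if reading:
        lines.append(f"[{reading}]")
    lines.extend(f"{i}. {gloss}" for i, gloss in enumerate(glosses, 1))
    definition = "\n".join(lines)
    return f"{escape_field(headword)}\t{escape_field(definition)}"
```

Writing one such line per entry to a UTF-8 text file gives a file where each gloss shows on its own line once a reader unescapes `\n`, instead of everything running together.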

[Attached screenshots: Screenshot 2024-10-31 094418, Screenshot 2024-10-31 095528]