digitalpalidictionary / dpd-db


Pyglossary exporter #9

Closed bergentroll closed 8 months ago

bergentroll commented 11 months ago

Replace tools/stardict.py export with PyGlossary export.

Rationale:

Notes:

Other changes:

I hope to do some work on exporters for DPS and alternative formats in future.

:pray: :pray: :pray:

gambhiro commented 10 months ago

Dear Anton @bergentroll, did you verify that the pyglossary stardict exporter includes the necessary formatting, e.g. inflection definitions, html pages, etc., in the output?

I can see it might be useful to export to other formats, and perhaps that is your motivation.

I don't mind if you don't need that little module from Simsapa, but I specifically added it because the pyglossary exporter only wrote the headword and definition fields for stardict format, ignoring the inflections.

Also, the dictzip dependency was a problem when trying to run the export on Windows.

bergentroll commented 10 months ago

Bhante Gambhīro :pray: :pray: :pray:

> Dear Anton @bergentroll, did you verify that the pyglossary stardict exporter includes the necessary formatting, e.g. inflection definitions, html pages, etc., in the output?

I am trying not to miss anything. For now the output seems fine: it has exactly the same number of word entries and html-formatted definitions, including inflections and special entries (like "thanks"), as the stardict.py output. If Bhante may, please check that the :file_folder:build is indeed appropriate. :pray:

> ... I specifically added it because the pyglossary exporter only wrote the headword and definition fields for stardict format, ignoring the inflections

PyGlossary has no separate synonyms field; instead, a list of strings must be passed as the headword, so that the extra strings are bound to it as "synonyms". Initially I was stuck on this, but now it is OK.
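A minimal sketch of that idea, for readers following along: the headword goes first in the list and the inflected forms follow as synonyms. The helper name `build_words` and the sample inflections are hypothetical and not taken from this PR; the PyGlossary call is only mentioned in a comment, as an assumption about how the list would be consumed.

```python
def build_words(headword: str, inflections: list[str]) -> list[str]:
    """Return [headword, *synonyms] with blanks and duplicates removed.

    Hypothetical helper: the headword stays first, inflected forms follow.
    """
    words = [headword]
    seen = {headword}
    for form in inflections:
        form = form.strip()
        # Skip empty strings and forms already present (including the headword)
        if form and form not in seen:
            seen.add(form)
            words.append(form)
    return words


# Illustrative data only
words = build_words("dhamma", ["dhammo", "dhammaṃ", "dhamma", "", "dhammo"])

# Assumption: with PyGlossary installed, this list would be passed where a
# single headword string is normally given, e.g.
#   glos.newEntry(words, html_definition, defiFormat="h")
```

Keeping the headword at index 0 matters because the first string is the one shown as the entry title; the rest only serve for lookup.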

> I can see it might be useful to export to other formats, and perhaps that is your motivation.

Yes, and another motivation is to keep less code in order to reduce the complexity of the project.

> Also, the dictzip dependency was a problem when trying to run the export on Windows.

There is an idea to implement compression with idzip in PyGlossary. If it does not land in PyGlossary, it should not be hard to use idzip in pyglossary_exporter.py just before the zipping step.
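A rough sketch of what "just before the zipping step" could look like, using only the standard library. The function names `zip_export` and `gzip_stand_in` are hypothetical, and the stand-in compressor writes plain gzip, which is not the real dictzip format; a pure-Python dictzip implementation such as idzip would be plugged in at the same place.

```python
import gzip
import shutil
import zipfile
from pathlib import Path
from typing import Callable, Optional


def gzip_stand_in(dict_path: Path) -> Path:
    """Stand-in compressor (assumption): plain gzip, NOT true dictzip.

    Writes <name>.dict.dz next to the input and removes the original.
    """
    dz_path = dict_path.with_name(dict_path.name + ".dz")
    with open(dict_path, "rb") as src, gzip.open(dz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    dict_path.unlink()
    return dz_path


def zip_export(
    src_dir: Path,
    zip_path: Path,
    compress_dict: Optional[Callable[[Path], Path]] = None,
) -> list[str]:
    """Optionally compress *.dict files, then zip every file in src_dir."""
    if compress_dict is not None:
        for dict_file in sorted(src_dir.glob("*.dict")):
            compress_dict(dict_file)

    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(src_dir.iterdir()):
            # Never add the archive to itself if it lives inside src_dir
            if f.is_file() and f.resolve() != zip_path.resolve():
                zf.write(f, arcname=f.name)
                names.append(f.name)
    return names
```

Passing the compressor as a callable keeps the exporter free of any external binary: the same zipping code works with no compression, with the stand-in above, or with a real dictzip writer.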

gambhiro commented 10 months ago

The missing synonyms field stood out for me when I was testing pyglossary, but it seems you found a solution for that.

bergentroll commented 10 months ago

No more /usr/bin/dictzip dependency.