digitalpalidictionary / dpd-db

How to build DPD for Simsapa #20

Closed gambhiro closed 3 months ago

gambhiro commented 9 months ago

I am primarily interested in generating the complete DPD dictionary in SQLite, including the compound breakup and word lookup facilities. Currently I run a migration script on dpd.db which adds the .json results to the db.
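For illustration, a migration step of this kind might look like the sketch below. It is only a minimal sketch, not the actual Simsapa migration script: the table name `deconstructor_lookup`, its columns, and the assumption that the exported .json is a flat headword-to-breakup mapping are all hypothetical.

```python
import json
import sqlite3


def import_breakups(conn, breakups):
    """Store a headword -> breakup mapping in a lookup table.

    The table name and schema are hypothetical, chosen for this sketch only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deconstructor_lookup ("
        "headword TEXT PRIMARY KEY, breakup TEXT NOT NULL)"
    )
    # Each breakup value is serialised as JSON text so lists survive the round trip.
    rows = [
        (headword, json.dumps(breakup, ensure_ascii=False))
        for headword, breakup in breakups.items()
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO deconstructor_lookup (headword, breakup) "
            "VALUES (?, ?)",
            rows,
        )
    return len(rows)


def lookup_breakup(conn, headword):
    """Return the stored breakup for a headword, or None if it is absent."""
    row = conn.execute(
        "SELECT breakup FROM deconstructor_lookup WHERE headword = ?",
        (headword,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

In use, the mapping would be read with `json.load()` from whatever file the exporter writes, and `conn` would be `sqlite3.connect("dpd.db")`.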

So perhaps we should limit the scope of #18 to just this?

makedict.sh has the following sequence:

inflections/inflections_to_headwords.py
grammar_dict/grammar_dict.py

exporter/exporter.py
exporter/deconstructor_exporter.py

exporter/tpr_exporter.py

Questions:

Is the grammar_dict.py now obsoleted in favour of the deconstructor_exporter.py, or perhaps it does something different?

Is the deconstructor_exporter.py dependent on exporter.py, or can it be executed separately?

The tpr_exporter.py::generate_sandhi_data() produces a headword to breakup dict. Is this different from the deconstructor?

Devamitta commented 9 months ago

As far as I am aware, grammar_dict/grammar_dict.py, exporter/exporter.py, exporter/deconstructor_exporter.py and exporter/tpr_exporter.py are independent of each other, but they all depend on db components from generating_components.sh. Again, the file relationship may help you find out exactly which components are required for each dictionary.
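That dependency can be sketched as a small runner that generates the components once and then runs only the exporters that are needed. This is a hedged sketch: the script paths are the ones named in this thread, but the short names in `EXPORTERS`, the `bash` invocation, and running everything from the repository root are assumptions.

```python
import subprocess

# Script paths as named in this thread; the short keys are hypothetical labels.
EXPORTERS = {
    "grammar": "grammar_dict/grammar_dict.py",
    "dpd": "exporter/exporter.py",
    "deconstructor": "exporter/deconstructor_exporter.py",
    "tpr": "exporter/tpr_exporter.py",
}


def build_commands(selected, python="python3"):
    """Return the command list: component generation first, then each exporter."""
    # Assumption: generating_components.sh is run with bash from the repo root.
    commands = [["bash", "generating_components.sh"]]
    for name in selected:
        commands.append([python, EXPORTERS[name]])
    return commands


def run_all(commands, cwd="."):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

For example, `run_all(build_commands(["deconstructor"]))` would rebuild the components and then run only the deconstructor exporter, skipping exporter.py entirely.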

Is the grammar_dict.py now obsoleted in favour of the deconstructor_exporter.py, or perhaps it does something different?

dpd-deconstructor and dpd-grammar are two separate dictionaries with different purposes. The deconstructor breaks up sandhi. The grammar dictionary shows the declension or conjugation of a word.
