NOTE: The following is for discussion. The current implementation differs from this.
This document describes the Markdown and TOML content of the dictionary-wiki; the code that parses that data is implemented in this project.
A TOML block represents data as plain text that is easy for humans to read and write, and that programs can parse.
Simple entry
This form lets us add short entries that still provide minimal word-to-definition lookup.
The definition body can be long, for example when imported from another dictionary. People can read and understand it from the text, while at least some information is captured in the metadata for the program to use.
A simple Markdown page consists of a title, some basic metadata in a TOML block, and the definition text of the entry in the rest of the file, which may be a short summary or a longer description.
# upagacchati
``` toml
word = "upagacchati"
dict_label = "PTS"
[[meanings]]
summary = "to come to, go to, approach, flow to (of water)"
[[meanings]]
summary = "to undergo, go (in) to, to begin, undertake"
```
1. to come to, go to, approach, flow to (of water) DN.ii.12; Pv\-a.12 (vasanaṭṭhānaṃ), Pv\-a.29, Pv\-a.32 (vāsaṃ) Pv\-a.132; ger *\-gantvā* Pv\-a.70 (attano santikaṃ), & *\-gamma* SN.ii.17, SN.ii.20.
2. to undergo, go (in) to, to begin, undertake Snp.152 (diṭṭhiṃ anupagamma); Ja.i.106 (vassaṃ); Pv\-a.42 (id.); Ja.i.200; niddaṃ upagacchati to drop off into sleep Pv\-a.43 (aor. upagacchi MSS. ˚gañchi), Pv\-a.105, Pv\-a.128
pp. of *[upagata](upagata.md)* (q.v.).
upa + gacchati
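As a rough illustration of how such a page could be taken apart, the sketch below splits a simple entry into its title, the raw TOML text, and the definition body. It is not the project's parser: the function name `split_entry_page`, the returned keys, and the regular expressions are assumptions made for this example, and it only handles the first `toml` fence on the page.

```python
import re

# Built indirectly so that this example can itself sit inside a Markdown fence.
FENCE = "`" * 3

# Assumption: the metadata is the first fenced block whose info string is "toml".
# The example page writes the fence with a space before "toml", so spaces are optional.
TOML_BLOCK = re.compile(
    r"^`{3}[ \t]*toml[ \t]*\n(.*?)^`{3}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)
TITLE = re.compile(r"^#[ \t]+(.+)$", re.MULTILINE)


def split_entry_page(page: str) -> dict:
    """Split a simple entry page into title, TOML text, and definition body."""
    block = TOML_BLOCK.search(page)
    if block is None:
        raise ValueError("no TOML block found in page")
    # Look for the title only before the TOML block, so TOML comments
    # starting with '#' are never mistaken for a heading.
    title = TITLE.search(page[: block.start()])
    return {
        "title": title.group(1).strip() if title else "",
        "toml_text": block.group(1),
        "definition_md": page[block.end():].strip(),
    }


SAMPLE_PAGE = "\n".join([
    "# upagacchati",
    "",
    FENCE + " toml",
    'word = "upagacchati"',
    'dict_label = "PTS"',
    "",
    "[[meanings]]",
    'summary = "to come to, go to, approach, flow to (of water)"',
    FENCE,
    "",
    "1. to come to, go to, approach, flow to (of water) DN.ii.12",
    "",
    "upa + gacchati",
])

parts = split_entry_page(SAMPLE_PAGE)
print(parts["title"])
print(parts["definition_md"])
```

The `toml_text` value could then be handed to any TOML parser, for instance `tomllib.loads` in Python 3.11 or later.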
Complex entry
The more metadata is captured in TOML, the better the program can use it for dictionary lookup and database processing.
The definition body may still contain long descriptions, but information such as grammatical construction and example sentences can be captured in the metadata and used in other parts of the program.
The TOML block may contain the following:
word = "upagacchati" # The dictionary word lookup entry.
word_nom_sg = "" # The nominative singular form (if applicable).
dict_label = "PTS" # A label to distinguish dictionary sources or authors.
inflections = [] # Inflected or conjugated forms such as plurals, which should return this word entry.
phonetic = "" # Phonetic spelling, such as IPA.
transliteration = "" # Transliteration to Latin from other alphabets such as Thai or Chinese.
# First meaning.
[[meanings]]
# Short translation in English.
summary = "to come to, go to, approach, flow to (of water)"
synonyms = [] # Different words with similar meaning.
antonyms = [] # Opposite meanings.
variants = [] # Similar form or construction but different meaning.
also_written_as = [] # Spelling variations.
see_also = ["upagata"] # Related terms.
example_count = 2 # A helper number to mark the number of examples collected for this meaning.
# Grammar of first meaning.
[[meanings.grammar]]
pali_roots = ["upa", "gam"]
pali_root_groups = ["upa", "gam"]
pali_root_group = "1.1"
pali_root_sign = "a"
prefix_and_root = "ā bhuj" # e.g. for ābhujati
construction = "upa + gaccha + ti"
base_construction = "gam + a = gaccha" # Root and conjugation sign
compound_type = ""
compound_construction = ""
sanskrit_word = ""
sanskrit_roots = []
comment = "pp. of upagata" # General grammar comment.
speech = "verb" # Part of speech.
case = "acc." # Specific grammar properties.
num = ""
gender = ""
person = ""
voice = ""
object = ""
transitive = "trans." # trans. / intrans. / ditrans. / empty
negative = "" # true / false / empty
verb = "" # causative / passive / denominative / intensive / empty
# First example of first meaning.
[[meanings.examples]]
sutta_ref = ""
sutta_title = "paṭhama dārukkhandhopamasuttaṃ"
text_pali = "evam'eva kho, bhikkhave, sace tumhe'pi na orimaṃ tīraṃ upagacchatha, na pārimaṃ tīraṃ upagacchatha..."
text_english = ""
# Second example of first meaning.
[[meanings.examples]]
sutta_ref = "..."
# Second meaning.
[[meanings]]
summary = "to undergo, go (in) to, to begin, undertake"
# First example of second meaning.
[[meanings.examples]]
sutta_ref = "SN 12.61"
sutta_title = "assutavāsuttaṃ"
text_pali = "varaṃ bhikkhave assutavā puthujjano imaṃ cātumahābhūtikaṃ kāyaṃ attato upagaccheyya na tv'eva cittaṃ"
text_english = ""
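To make the role of `inflections` concrete, here is a minimal sketch of a lookup index in which both the headword and its inflected forms return the same entry. It assumes the TOML has already been parsed into plain dictionaries; `build_lookup_index` is a hypothetical helper, and the inflected forms listed are illustrative values taken from the example sentences above, not a complete paradigm.

```python
from collections import defaultdict


def build_lookup_index(entries):
    """Map every headword and inflected form to the entries it should return."""
    index = defaultdict(list)
    for entry in entries:
        forms = [entry["word"], *entry.get("inflections", [])]
        for form in forms:
            # Skip empty strings and avoid listing the same entry twice.
            if form and entry not in index[form]:
                index[form].append(entry)
    return dict(index)


# Assumed shape of one parsed entry, mirroring the TOML keys above.
ENTRIES = [
    {
        "word": "upagacchati",
        "dict_label": "PTS",
        "inflections": ["upagacchatha", "upagaccheyya", "upagacchi"],
        "meanings": [
            {"summary": "to come to, go to, approach, flow to (of water)"},
            {"summary": "to undergo, go (in) to, to begin, undertake"},
        ],
    },
]

index = build_lookup_index(ENTRIES)
for hit in index.get("upagaccheyya", []):
    print(hit["word"], "-", hit["meanings"][0]["summary"])
```

Keeping a list per form leaves room for the same spelling to appear in several dictionaries, which `dict_label` is meant to distinguish.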
Spreadsheet columns
If you are using a spreadsheet to collect words, use the following headers, which can be parsed back into the TOML above.
If a word has more than one meaning, add a new row for each meaning and number the rows with meaning_order (1, 2, etc.).
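The row-per-meaning layout can be folded back into the nested structure roughly as follows. This is a sketch under stated assumptions: only `meaning_order` is named in this document, the other column names (`word`, `dict_label`, `summary`) are assumed to mirror the TOML keys, and `rows_to_entries` is a hypothetical helper working on a CSV export of the spreadsheet.

```python
import csv
import io

# Assumed CSV export; the rows are deliberately out of order.
SHEET = """word,dict_label,meaning_order,summary
upagacchati,PTS,2,"to undergo, go (in) to, to begin, undertake"
upagacchati,PTS,1,"to come to, go to, approach, flow to (of water)"
"""


def rows_to_entries(csv_text):
    """Group spreadsheet rows into one entry per word, meanings sorted by meaning_order."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["word"], row["dict_label"])
        entry = grouped.setdefault(
            key,
            {"word": row["word"], "dict_label": row["dict_label"], "meanings": []},
        )
        # Keep the order number alongside the meaning until sorting is done.
        entry["meanings"].append((int(row["meaning_order"]), {"summary": row["summary"]}))

    entries = []
    for entry in grouped.values():
        ordered = sorted(entry["meanings"], key=lambda pair: pair[0])
        entry["meanings"] = [meaning for _, meaning in ordered]
        entries.append(entry)
    return entries


for entry in rows_to_entries(SHEET):
    print(entry["word"], len(entry["meanings"]))
```

Each resulting `meanings` list corresponds to the sequence of `[[meanings]]` tables in the TOML block.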