nominals/lexnorm-all2.txt

sanskrit-lexicon / MWinflect

Generate declensions and conjugations based upon words in MW1899 dictionary.


nominals/lexnorm-all2.txt #2

Open funderburkjim opened 6 years ago

funderburkjim commented 6 years ago

lexnorm-all2.txt contains data drawn from the mw digitization for records identified as nominals or indeclinables.

It begins as a copy of lexnorm-all2.txt.

This is a tab-separated file with four fields:

L = cologne record id for a record in Cologne digitization of Monier-Williams Sanskrit-English dictionary
key1 = headword in citation form
key2 = headword with compound boundaries marked by '-'
lexnorm = inflection information in one of two forms:
- 1: LEXID=X,STEM=Y[...] for pronouns, cardinal numbers, etc.
- 2: m:f#X:n gender information, along with stem hints
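As a rough illustration of the layout described above, here is a minimal Python sketch that splits one line into its four fields and then separates the two lexnorm forms. The names `Record`, `parse_line` and `parse_lexnorm` are hypothetical and are not taken from the repository. The sketch also assumes that form 1 is a comma-separated list of KEY=VALUE pairs and that in form 2 the text after '#' is the stem hint for that gender; the description above does not spell out either detail.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One line of lexnorm-all2.txt (hypothetical container)."""
    L: str        # Cologne record id
    key1: str     # headword in citation form
    key2: str     # headword with '-' compound marking
    lexnorm: str  # inflection information, unparsed


def parse_line(line: str) -> Record:
    """Split a tab-separated line into its four fields."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(parts)}")
    return Record(*parts)


def parse_lexnorm(lexnorm: str):
    """Return ('lexid', {...}) for form 1 or ('gender', [...]) for form 2.

    Assumption: form 1 is comma-separated KEY=VALUE pairs; form 2 is
    colon-separated genders, each optionally followed by '#' and a hint.
    """
    if lexnorm.startswith("LEXID="):
        fields = {}
        for item in lexnorm.split(","):
            key, _, value = item.partition("=")
            fields[key] = value
        return ("lexid", fields)
    genders = []
    for item in lexnorm.split(":"):
        gender, _, hint = item.partition("#")
        genders.append((gender, hint or None))
    return ("gender", genders)
```

For example, `parse_lexnorm("m:f#X:n")` yields three gender entries, with the hint `X` attached only to the feminine.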

funderburkjim commented 6 years ago

changes to lexnorm-all2.txt

We will use a program (stem-model.py) to derive words with similar models from lexnorm-all2.txt. This derivation will use the key2 and lexnorm fields. Sometimes the information in these fields will not be quite right for this purpose, so we allow modifications to our local copy of lexnorm-all2.txt. We'll keep the editing changes in a separate file, edit_lexnorm-all2.txt, for reference and comments. Some of the changes will flow back into corrections to the mw.txt digitization.

funderburkjim commented 6 years ago

removals from lexnorm-all2

In addition to corrective changes, some records are moved out of lexnorm-all2 into separate files. Their models require special handling, and identifying the other models is simpler when these are treated separately.

Currently, these extractions from lexnorm-all2 are:

- lexnorm-all2-part.txt: participles (about 400)
- lexnorm-all2-proncpd.txt: compounds with pronoun Bavat (4)
- lexnorm-all2-inflectid.txt: headwords that mw presents in dual or plural form, so the stem must be determined specially
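The extraction step can be pictured as a simple partition of the file by record id. The sketch below is only an assumption about how such a split might be done: `split_records` is a hypothetical helper, and the criteria actually used to pick the participles, pronoun compounds and dual/plural headwords are not described here, so the ids to extract are passed in as a ready-made set.

```python
def split_records(lines, extract_ids):
    """Partition tab-separated lines by their first field (the L id).

    Returns (kept, extracted); line order is preserved in both lists.
    `extract_ids` is a set of L values chosen by some other means.
    """
    kept, extracted = [], []
    for line in lines:
        record_id = line.split("\t", 1)[0]
        if record_id in extract_ids:
            extracted.append(line)
        else:
            kept.append(line)
    return kept, extracted
```

Keeping the extracted lines in their original format means each separate file can still be read with the same field layout as lexnorm-all2.txt.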

gasyoun commented 6 years ago

lexnorm-all2-proncpd.txt compounds with pronoun Bavat (4)

What use?

funderburkjim commented 6 years ago

What use?

Bavat can be declined in two ways:

- As the present active participle of BU: in this form the nominative singular (1s) is Bavan
- As the 'honorific' pronoun: in this form the 1s is BavAn

There are a few other differences (see the Deshpande text, p. 191, for the discussion of the honorific pronoun).

So for each word ending in Bavat we need to know which of the two declensions applies.
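To make the difference concrete, here is a small Python sketch that produces only the nominative singular for a stem ending in Bavat, using the two endings quoted above (Bavan for the participle, BavAn for the honorific pronoun). The function name `nom_sg_bavat` and the `honorific` flag are hypothetical; in practice the flag would presumably come from whether the record sits in lexnorm-all2-proncpd.txt, and the other case forms are not covered.

```python
def nom_sg_bavat(stem: str, honorific: bool) -> str:
    """Nominative singular (1s) of a stem ending in 'Bavat' (SLP1).

    honorific=False: present active participle pattern, 1s in -an.
    honorific=True:  honorific pronoun pattern, 1s in -An.
    """
    if not stem.endswith("Bavat"):
        raise ValueError(f"stem does not end in 'Bavat': {stem!r}")
    base = stem[:-2]  # drop the final 'at'
    return base + ("An" if honorific else "an")
```

Calling `nom_sg_bavat("Bavat", honorific=False)` gives `Bavan`, while `honorific=True` gives `BavAn`; the stem alone cannot decide between them, which is why the pronoun compounds are kept in their own file.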