Open funderburkjim opened 6 years ago
We will use a program (stem-model.py) to derive words with similar models from lexnorm-all2.txt. This derivation will use the key2 and lexnorm fields. Sometimes the information in these fields will not be quite right for this purpose, so we allow modifications to our local copy of lexnorm-all2. We'll keep the edits in a separate file, edit_lexnorm-all2.txt, for reference and comments. Some of the changes will flow back as corrections to the mw.txt digitization.
In addition to corrective changes, some records are moved into separate files. Their models require special handling, and identifying the other models is simpler when these are treated separately.
Currently, these extractions from lexnorm-all2 are:
lexnorm-all2-proncpd.txt: compounds with the pronoun Bavat (4)
What use?
Bavat can be declined in two ways:
There are a few other differences as well (see the Deshpande text, p. 191, for a discussion of the honorific pronoun).
So for each word ending with Bavat we need to determine which way it should be declined.
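As a rough illustration of the extraction step, here is a minimal sketch that separates records whose key2 ends with Bavat from the rest. It is not taken from stem-model.py. The column position `KEY2_COL`, the helper names, and the sample rows are all assumptions made up for this example, and the hyphen stripping assumes key2 marks compound boundaries with `-`. The sketch only collects candidates; deciding which declension applies to each one is still a manual step.

```python
import csv
import io

# Hypothetical position of the key2 field; the real column order may differ.
KEY2_COL = 1


def read_rows(stream):
    """Read tab-separated records, skipping blank lines."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


def split_by_suffix(rows, suffix="Bavat", key2_col=KEY2_COL):
    """Partition rows into (matching, remaining) by the ending of key2."""
    matching, remaining = [], []
    for row in rows:
        # Drop hyphens (assumed compound separators) before testing the ending.
        key2 = row[key2_col].replace("-", "") if len(row) > key2_col else ""
        if key2.endswith(suffix):
            matching.append(row)
        else:
            remaining.append(row)
    return matching, remaining


# Invented sample rows (id, key2, placeholder), for illustration only.
SAMPLE = "1\tatra-Bavat\tx\n2\trAma\tx\n3\ttatra-Bavat\tx\n"

matching, remaining = split_by_suffix(read_rows(io.StringIO(SAMPLE)))
print([row[KEY2_COL] for row in matching])
print([row[KEY2_COL] for row in remaining])
```

With a real file, `read_rows` could be given an open file handle instead of the in-memory sample, and the two lists written back out as the extraction file and the reduced local copy.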
lexnorm-all2.txt contains data drawn from the mw digitization for records identified as nominals or indeclinables.
It begins as a copy of lexnorm-all2.txt.
This is a tab-separated file (csv format with the tab character as separator). Fields: