Thought: The current state of Marcion data is imperfect. We will likely have
to introduce new types (e.g. articled vs. non-articled nouns) in order to
build an accurate inflection module. We might also have to populate the
derivations data differently.

- Implement normalization of the remaining annotations, namely `-` for prenominal forms, `=` for pronominal, `+` for qualitative, and `―` for same as above. Just carry them on a separate field in `lexical.structured_word`, just like you did with attestations.
- Implement normalization for English-within-Coptic. Just carry them on a separate field in `lexical.structured_word`, just like you did with attestations.
- Control `constants.ACCEPTED_UNKNOWN_CHARACTERS*`. It should be possible to exercise more rigor once the extra normalization steps have been implemented.
- Detached types override / invalidate root types. Investigate.
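
The annotation-normalization item could be sketched roughly as follows. This is only an illustration under stated assumptions: the `StructuredWord` class is a hypothetical stand-in for `lexical.structured_word` (whose real shape is not shown here), the `annotations` field name and the `split_annotations` helper are invented for the example, and the markers are assumed to appear as trailing characters on a form.

```python
from dataclasses import dataclass, field
from typing import List

# Marker-to-label mapping, taken from the task list.
ANNOTATION_MARKERS = {
    "-": "prenominal",
    "=": "pronominal",
    "+": "qualitative",
    "―": "same as above",
}


@dataclass
class StructuredWord:
    """Hypothetical stand-in for lexical.structured_word."""

    spelling: str
    # Assumed field name: annotations are carried separately from the
    # spelling, in the same spirit as attestations.
    annotations: List[str] = field(default_factory=list)


def split_annotations(raw: str) -> StructuredWord:
    """Strip trailing annotation markers from `raw` and record their labels.

    Assumption: markers only occur at the end of a form.
    """
    spelling = raw.strip()
    found: List[str] = []
    while spelling and spelling[-1] in ANNOTATION_MARKERS:
        found.append(ANNOTATION_MARKERS[spelling[-1]])
        spelling = spelling[:-1].rstrip()
    found.reverse()  # restore left-to-right order of the markers
    return StructuredWord(spelling=spelling, annotations=found)
```

With illustrative inputs, `split_annotations("ⲕⲁ-")` would yield a word whose spelling is `ⲕⲁ` and whose annotations are `["prenominal"]`, while an unmarked form such as `ⲕⲱ` passes through with an empty annotation list. Keeping the markers out of the spelling is what should later allow the unknown-character check to be tightened.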