UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Tags use #952

Closed Andres-Chandia closed 1 year ago

Andres-Chandia commented 1 year ago

Hi there, I'm just getting started with this topic. I would like to give standard morphological annotation to my Mapudüngun morphological analyses. Sorry if my question is too naive, but I cannot figure out the proper way to tag, for instance, a transitive or intransitive verb; so far I have been using -IV and -TV. I use a nomenclature like this:

-IV.elu_give+PASS.nge23+CF.ke14+IPD.fu8+IND.y4+1.Ø3+PL.iñ2

As Mapudüngun is an agglutinative language, the glosses represent the root as -PoS.root_meaning and each suffix as +SUFFIXTAG.formSLOT, where the number at the end of the suffix tag identifies the slot in which the suffix occurs. Thanks in advance...
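For readers who want to process glosses in this notation, here is a minimal Python sketch of a parser. It assumes the notation is exactly as described in the post (a leading "-", a root written as PoS.form_meaning, and suffixes written as TAG.formSLOT joined by "+"). The names `Morpheme` and `parse_gloss` are hypothetical and are not part of any UD tool.

```python
import re
from typing import List, NamedTuple, Optional


class Morpheme(NamedTuple):
    tag: str                # e.g. "IV" for the root, "PASS" for a suffix
    form: str               # surface form of the morpheme, e.g. "elu"
    slot: Optional[int]     # suffix slot number; None for the root
    meaning: Optional[str]  # root meaning, e.g. "give"; None for suffixes


# Assumed suffix shape: TAG.form followed by a trailing slot number.
SUFFIX_RE = re.compile(r"^(?P<tag>[^.]+)\.(?P<form>.*?)(?P<slot>\d+)$")


def parse_gloss(gloss: str) -> List[Morpheme]:
    """Split a gloss such as '-IV.elu_give+PASS.nge23' into morphemes."""
    root_part, *suffix_parts = gloss.lstrip("-").split("+")

    # Root: PoS.form_meaning
    tag, _, rest = root_part.partition(".")
    form, _, meaning = rest.partition("_")
    morphemes = [Morpheme(tag, form, None, meaning or None)]

    # Suffixes: TAG.formSLOT
    for part in suffix_parts:
        match = SUFFIX_RE.match(part)
        if match is None:
            raise ValueError(f"unrecognized suffix gloss: {part!r}")
        morphemes.append(
            Morpheme(match["tag"], match["form"], int(match["slot"]), None)
        )
    return morphemes


example = "-IV.elu_give+PASS.nge23+CF.ke14+IPD.fu8+IND.y4+1.Ø3+PL.iñ2"
for morpheme in parse_gloss(example):
    print(morpheme)
```

Keeping the slot as a separate integer field makes it possible to sort or validate suffix order without re-parsing the tag string.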

dan-zeman commented 1 year ago

There is a language-specific feature Subcat that some languages use to express transitivity: a transitive verb gets Subcat=Trans, an intransitive verb gets Subcat=Intr.

You can represent the morphemic segmentation and corresponding glosses in the MISC column in the optional attributes MSeg and MGloss.

The main annotation in the UPOS and FEATS columns refers to the whole word and not to individual morphemes. Hence, for your example, you will probably have VERB in UPOS (if I understand correctly that the whole thing is a verb) and, in FEATS, Mood=Ind|Number=Plur|Person=1|Subcat=Intr|Voice=Pass (and maybe something to represent your CF and IPD; I have no idea what those glosses mean).
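To make the advice above concrete, here is a rough Python sketch that assembles one CoNLL-U token line, with word-level features in FEATS and the segmentation and glosses in MISC. Several details are assumptions for illustration only: the surface form "elungekefuyiñ" (the morphemes simply concatenated, with the null morpheme dropped), the lemma "elu", and the hyphen used as the separator inside MSeg and MGloss. The helpers `feats_string` and `conllu_token` are hypothetical, and HEAD and DEPREL are left as "_" because the thread does not discuss syntax.

```python
from typing import Dict


def feats_string(feats: Dict[str, str]) -> str:
    """Join features as Name=Value pairs, sorted by feature name."""
    if not feats:
        return "_"
    pairs = sorted(feats.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{name}={value}" for name, value in pairs)


def conllu_token(idx: int, form: str, lemma: str, upos: str,
                 feats: Dict[str, str], misc: Dict[str, str]) -> str:
    """Build a 10-column, tab-separated CoNLL-U line for one word."""
    misc_str = "|".join(f"{k}={v}" for k, v in misc.items()) or "_"
    columns = [
        str(idx),             # ID
        form,                 # FORM
        lemma,                # LEMMA
        upos,                 # UPOS
        "_",                  # XPOS (unused here)
        feats_string(feats),  # FEATS: describes the whole word
        "_",                  # HEAD (not covered in this sketch)
        "_",                  # DEPREL (not covered in this sketch)
        "_",                  # DEPS
        misc_str,             # MISC: MSeg / MGloss go here
    ]
    return "\t".join(columns)


line = conllu_token(
    1,
    "elungekefuyiñ",  # assumed surface form
    "elu",            # assumed lemma
    "VERB",
    {"Voice": "Pass", "Mood": "Ind", "Person": "1",
     "Number": "Plur", "Subcat": "Intr"},
    {"MSeg": "elu-nge-ke-fu-y-Ø-iñ",              # assumed separator "-"
     "MGloss": "give-PASS-CF-IPD-IND-1-PL"},
)
print(line)
```

The CF and IPD glosses appear only in MGloss here, since the thread leaves open which word-level features, if any, would represent them.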

Andres-Chandia commented 1 year ago

OK, so it is just for representation in a corpus... Is there some way to represent analysis glosses the way I did in the previous example? Something that does not make the glosses too intricate?

dan-zeman commented 1 year ago

The MSeg and MGloss attributes in the MISC column are for glosses. But UD does not attempt to standardize the tags used in the glosses; you would have to adopt some other standard for that. UniMorph could be useful.

Andres-Chandia commented 1 year ago

Thanks a lot Dan, I will dig into that...