UniversalDependencies / UD_Latvian-LVTB

Creative Commons Attribution Share Alike 4.0 International

Lemmas for feminine and masculine adjectives #3

Closed tomsbergmanis closed 5 years ago

tomsbergmanis commented 5 years ago

Dear colleagues, in the Latvian UDT, the feminine and masculine forms of the same adjective have separate lemmas. As a result, the two forms look like unrelated words, because they are not linked through a shared lemma. While there might be a subtle linguistic motivation for this conceptualization, in my opinion it only creates data sparsity and causes inconsistencies with other data sets.

Let me know if I can be of any help should you decide to implement the changes. T. Bergmanis, Ph.D. candidate at the University of Edinburgh

lauma commented 5 years ago

I'll ask around. There was some linguistic reasoning behind it a long time ago, but I'm not sure if it is relevant anymore.

tomsbergmanis commented 5 years ago

Thanks! I am asking because it would be nice to reach consistency between the Latvian UDT and UniMorph morphological standards. This seemed like an artifact that could be changed in UDT (while most of the work, in my opinion, lies on the UniMorph side).

lauma commented 5 years ago

Yes, for UD v2.3 adjectives and numerals will have the same lemma for both genders, with an exception for adjectival surnames like Anna Baltā, where the lemma will stay capitalized Baltā, at least for now. I don't know if this is the best solution for surnames; we need to do more thinking on that. Also, those cases are fairly rare: if I remember correctly, I saw 3 such surnames in the whole corpus; two were contemporary persons and one was Ivans Bargais (Ivan the Terrible).
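For anyone who wants to check a treebank file for leftover gender-specific lemmas, here is a minimal sketch, not the maintainers' tooling. It assumes standard 10-column CoNLL-U input and the UD `Gender=Fem` / `Gender=Masc` features; the function names and the two sample tokens are hypothetical and only illustrate the idea. It lists ADJ/NUM lemmas that occur with feminine tokens but never with masculine ones. Such a list would also include any genuinely feminine-only words, as well as lemmas that simply have no masculine occurrence in the file, so the output needs manual review.

```python
from collections import defaultdict


def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Nom|Gender=Fem' into a dict."""
    if not feats or feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def lemmas_by_gender(conllu_text, upos_tags=("ADJ", "NUM")):
    """Map each Gender value to the set of lemmas seen with it."""
    result = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, lemma, upos, feats = cols[0], cols[2], cols[3], cols[5]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword token ranges and empty nodes
        if upos not in upos_tags:
            continue
        gender = parse_feats(feats).get("Gender")
        if gender:
            result[gender].add(lemma)
    return result


def feminine_only_lemmas(conllu_text):
    """Lemmas that occur with Gender=Fem but never with Gender=Masc."""
    by_gender = lemmas_by_gender(conllu_text)
    return sorted(by_gender.get("Fem", set()) - by_gender.get("Masc", set()))


# Hypothetical two-token sample in the old style (separate lemma per gender).
SAMPLE = "\n".join([
    "# text = hypothetical sample",
    "\t".join(["1", "balts", "balts", "ADJ", "_", "Gender=Masc|Number=Sing", "0", "root", "_", "_"]),
    "\t".join(["2", "balta", "balta", "ADJ", "_", "Gender=Fem|Number=Sing", "1", "conj", "_", "_"]),
])

print(feminine_only_lemmas(SAMPLE))  # → ['balta']
```

Once both forms share a lemma, the same call returns an empty list for this sample.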

lauma commented 5 years ago

We also speculated that there might be some adjectives that are used only in the feminine gender and thus shouldn't have a masculine lemma (Tezaurs.lv reports ālava as such), but it looks like there are currently no such cases in the treebank.

tomsbergmanis commented 5 years ago

Cool, cool, cool! Thank you very much! Toms