UniversalDependencies / UD_Irish-IDT


Consider renaming NomAcc case #71

Closed tlynn747 closed 2 years ago

tlynn747 commented 3 years ago

Following yesterday's discussion about case features reflecting form vs syntactic role, the common case issue raised its head again!

In Modern Irish the common case is a single form used for nouns functioning in the Nominative case, the Accusative case and most Dative contexts (i.e. in prepositional phrases). This is a remnant of the move away from Old/Middle Irish, where nouns inflected differently for each of these cases. A small number of nouns (e.g. Éire -> in Éirinn, teach -> tigh) still have Dative inflection in Modern Irish.

In the first conversion of the Irish data to UD, we used Case=Com to cover this common case form. However, it turns out that Case=Com is used by other languages for the Comitative case. So we decided to revert to Case=NomAcc instead.

However, because many nouns in a Dative role (i.e. objects of prepositions) take this case form too, the feature value name can cause confusion.

@dan-zeman can we change the value of Case=NomAcc to Case=Common?

colinbatchelor commented 3 years ago

Just to concur - the most recent grammar of Scottish Gaelic (http://www.clanntuirc.co.uk/GGGE.html) identifies the cases as gairmeach (voc), tabhartach (dat), ginideach (gen) and bunasach ('fundamental', 'basic') and the situation with prepositions is very similar, so it would be useful to have Case=Common (or whatever is decided) for gd too.

dan-zeman commented 3 years ago

> @dan-zeman can we change the value of Case=NomAcc to Case=Common?

I would probably go just with Case=Nom (explaining in the documentation that it has wider usage than the nominative in languages that have a distinct accusative form).

But if you insist on using a different label, then you can of course define and use it (make sure that it is documented and permitted for the required UPOS tags here). At present, the Irish treebank does not pass validation.
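For anyone who wants to try the relabelling locally before validating, here is a minimal sketch in Python. It assumes the standard 10-column, tab-separated CoNLL-U layout with FEATS in the sixth column; the function names `rename_case` and `rename_case_in_line` are hypothetical and are not part of any UD tool.

```python
def rename_case(feats, old="NomAcc", new="Nom"):
    """Return a FEATS string with Case=<old> rewritten to Case=<new>."""
    if feats == "_":  # "_" means the token has no features
        return feats
    pairs = []
    for pair in feats.split("|"):
        name, sep, value = pair.partition("=")
        if sep and name == "Case" and value == old:
            pair = f"{name}={new}"
        pairs.append(pair)
    # Only a value changes, so the existing ordering of feature names is kept.
    return "|".join(pairs)


def rename_case_in_line(line, old="NomAcc", new="Nom"):
    """Apply rename_case to one CoNLL-U line; comments and blank lines pass through."""
    if not line.strip() or line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:  # not a token line in the assumed layout
        return line
    cols[5] = rename_case(cols[5], old, new)  # FEATS is the sixth column
    return "\t".join(cols) + newline
```

Passing `new="Common"` instead would give the other option discussed above. Either way the result still has to go through the official validator, which this sketch does not replace.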

tlynn747 commented 3 years ago

Further discussion needed with other Celtic treebankers. To be addressed for v2.9.

tlynn747 commented 2 years ago

Any others in favour of Case=Common instead of reverting all to Case=Nom? @colinbatchelor I take it you are!

@jheinecke and @ftyers - does this apply to Welsh/Breton?

cc @kscanne

jheinecke commented 2 years ago

There is no grammatical case on nouns left over in Modern Welsh and Breton. The soft mutation of indefinite nouns in object position is not really a case (many linguists dealing with Welsh share this view). Pronouns have two series (in Welsh traditionally called independent and dependent pronouns), but this is not case either. Currently the Welsh treebank does not use the Case feature at all; in Breton it is used with pronouns.

ftyers commented 2 years ago

I'm happy either way. I think that Case=Nom with a description in the documentation as @dan-zeman proposes is fine. Alternatively we could follow Romanian where Nom/Acc are completely syncretic and it seems that they use Case=Acc,Nom (these can be disambiguated in some of the pronouns).
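To illustrate what the multi-valued option means for downstream code, here is a small Python sketch. It assumes the usual FEATS syntax, with `|` between features and `,` between alternative values of one feature; the helper names `parse_feats` and `has_case` are hypothetical.

```python
def parse_feats(feats):
    """Parse a CoNLL-U FEATS string into {feature name: list of values}."""
    if feats == "_":  # "_" means the token has no features
        return {}
    parsed = {}
    for pair in feats.split("|"):
        name, _, values = pair.partition("=")
        parsed[name] = values.split(",")  # "Acc,Nom" becomes ["Acc", "Nom"]
    return parsed


def has_case(feats, case):
    """True if the given value is among the token's Case values."""
    return case in parse_feats(feats).get("Case", [])


print(has_case("Case=Acc,Nom|Number=Sing", "Nom"))  # True
print(has_case("Case=Acc,Nom|Number=Sing", "Gen"))  # False
```

With Case=Acc,Nom a query for either value matches the syncretic form, whereas with plain Case=Nom a query for Acc would miss it. That is the trade-off the documentation would need to spell out.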

As an aside, in the Turkic languages we use Case=Nom for the unmarked case (which can be Nom/Acc/Gen) and Case=Acc for the marked case. I don't think it's a big issue to have the case as just Nom.

In Breton it is only used to mark the subject/object forms of the pronouns.

colinbatchelor commented 2 years ago

Having reread the UD guidelines I agree with @dan-zeman and @ftyers.

tlynn747 commented 2 years ago

Done :)