UniversalDependencies / UD_Irish-IDT

Irish data

Review of morph features Verbal Adjectives #121

Closed tlynn747 closed 3 years ago

tlynn747 commented 3 years ago

Inconsistencies in the features of adjectives that appear to be Verbal Adjectives (the issues appear mainly in the predicted-features section of the training file):

2612: Tá roinnt dánta seanchumtha atá oiriúnach go maith curtha isteach freisin (Case=Gen|Gender=Masc|Number=Plur)

2661: Taobh amuigh den teach bhíodh oilithreachtaí ar siúl go dtí toibreacha beannaithe Bhríde (Gender=Masc|Number=Plur)

2725: agus a charraigeacha briste (NounType=NotSlender|Number=Plur)

2748: Agus chuala sé a scréach chráite nuair a maraíodh le buille claímh í (Case=NomAcc|Form=Len|Gender=Fem|Number=Sing)

From Elaine's thesis (i.e. the source informing the original gold morph features): http://doras.dcu.ie/2349/1/PhD_Elaine_Final.pdf

'Verbal Adjectives' & 'Verbal Adjective vs. Verbal Noun Genitive Case Ambiguity' (p. 247; Appendix C, p. 8)

  1. If the head (modified) noun undergoes the action, the modifier is a verbal adjective e.g. 'Na Stáit Aontaithe'.
  2. If the head (modified) noun is the agent or facilitator of the action, the modifier is a verbal noun in the genitive case e.g. 'páirc imeartha'.
  3. If the modifying noun is clearly functioning as a common noun in a genitival noun phrase, i.e. is preceded by the determiner 'an', the modifier is a verbal noun in the genitive case, e.g. 'lá an chláraithe'.

kscanne commented 3 years ago

I already did a manual review of the verbal adj. vs. genitive verbal noun distinction so those should be all set (was part of some earlier bug that I can't remember now).

Agree that a careful review of verbal adjective features is still needed (all ADJ, really). I can produce a report of places where the NOUN and ADJ features don't agree in cases of deprel==amod if that would help.
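
A minimal sketch of what such a report could look like, assuming the treebank is in standard 10-column CoNLL-U and that agreement is compared on Case, Gender and Number only. The function names and the two-token sample are made up for illustration; they are not taken from the treebank or from any existing script.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Gen|Number=Plur' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def read_sentences(lines):
    """Yield (sent_id, tokens) per sentence; tokens maps token ID -> column list."""
    sent_id, tokens = None, {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
        elif line.startswith("#"):
            continue
        elif not line.strip():
            # A blank line ends the current sentence.
            if tokens:
                yield sent_id, tokens
            sent_id, tokens = None, {}
        else:
            cols = line.split("\t")
            # Keep ordinary tokens only; skip multiword ranges and empty nodes.
            if len(cols) == 10 and cols[0].isdigit():
                tokens[cols[0]] = cols
    if tokens:
        yield sent_id, tokens


def amod_mismatches(lines, keys=("Case", "Gender", "Number")):
    """List ADJ tokens attached as amod whose features differ from their NOUN head."""
    report = []
    for sent_id, tokens in read_sentences(lines):
        for cols in tokens.values():
            if cols[3] != "ADJ" or cols[7] != "amod":
                continue
            head = tokens.get(cols[6])
            if head is None or head[3] != "NOUN":
                continue
            adj, noun = parse_feats(cols[5]), parse_feats(head[5])
            # Compare only features present on both tokens.
            diff = {k: (noun[k], adj[k])
                    for k in keys
                    if k in adj and k in noun and adj[k] != noun[k]}
            if diff:
                report.append((sent_id, head[1], cols[1], diff))
    return report


# Invented toy annotation, for illustration only.
SAMPLE = "\n".join([
    "# sent_id = toy-1",
    "1\ttoibreacha\ttobar\tNOUN\t_\tCase=NomAcc|Gender=Masc|Number=Plur\t0\troot\t_\t_",
    "2\tbeannaithe\tbeannaithe\tADJ\t_\tCase=NomAcc|Gender=Masc|Number=Sing\t1\tamod\t_\t_",
])

for row in amod_mismatches(SAMPLE.splitlines()):
    print(row)
```

Each reported row gives the sentence ID, the head noun, the adjective, and the differing features as (noun value, adjective value) pairs. Features missing on either side are ignored rather than flagged, which is a design choice that could be tightened.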

kscanne commented 3 years ago

See also #23

tlynn747 commented 3 years ago

OK great!

But just to be sure we're on the same page, the following (in my understanding) should be Verbal Adjectives with the features VerbForm=Part and attached as amod - and fall under Type 1 above:

2612: Tá roinnt dánta seanchumtha atá oiriúnach go maith curtha isteach freisin
2748: chuala sé a scréach chráite
1207: sna tíortha forbartha
2320: a mbíodh fuinneoga gloine daite ann
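
A rough sketch of how that convention could be checked mechanically, assuming 10-column CoNLL-U input and a hand-maintained set of word forms expected to be verbal adjectives. The function name, the `expected_forms` set and the toy sample are hypothetical; the sample annotation is invented, not copied from the treebank.

```python
def check_verbal_adjectives(lines, expected_forms):
    """Return (sent_id, form, problem) for listed forms lacking VerbForm=Part or amod."""
    problems = []
    sent_id = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            continue
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, feats, deprel = cols[1], cols[5], cols[7]
        if form.lower() not in expected_forms:
            continue
        if "VerbForm=Part" not in feats.split("|"):
            problems.append((sent_id, form, "missing VerbForm=Part"))
        if deprel != "amod":
            problems.append((sent_id, form, "deprel is %s, expected amod" % deprel))
    return problems


# Invented toy annotation, for illustration only.
TOY = "\n".join([
    "# sent_id = toy-2",
    "1\ttíortha\ttír\tNOUN\t_\tCase=NomAcc|Gender=Fem|Number=Plur\t0\troot\t_\t_",
    "2\tforbartha\tforbartha\tADJ\t_\tCase=NomAcc|Gender=Fem|Number=Plur\t1\tamod\t_\t_",
])

for problem in check_verbal_adjectives(TOY.splitlines(), {"forbartha", "daite"}):
    print(problem)
```

Because the check is driven by a fixed list of forms, it only verifies known verbal adjectives and cannot find ones that have not yet been identified.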

While I'm checking them I'll also look out for candidates for the acl label as per issue #86, e.g. níor chonaic sé na haghaidheanna smeartha le snas roimhe.

kscanne commented 3 years ago

Yes, same page. I'll note there could be some debate with examples like "seanchumtha" since there's not really a verb "seanchum" that this is based on. Similarly with stuff like "dosheachanta" which looks sort of like a verbal adjective but really isn't (the one example of "dosheachanta" in the treebank has features Case=NomAcc|Gender=Fem|Number=Plur and not VerbForm=Part, fwiw)

tlynn747 commented 3 years ago

50% of training file reviewed (down to sent 3051). Most of the issues are in this file due to prediction of features. But will go over test and dev once this file is finished.

tlynn747 commented 3 years ago

Full review of training file done.

Note, however, that this was a review of the features of all verbal adjectives, not a review of all adjectives to find unmarked verbal adjectives!

Came across some instances that I changed to acl, but it was difficult to do two tasks at once as I was checking every adjective... Will review the acl issue separately in issue #86.