UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Number feature for verbs esp. modal AUXes #482

Closed nschneid closed 2 months ago

nschneid commented 7 months ago

In general, GUM and EWT contain Number features for verbs/auxes to reflect agreement with the subject. I assumed the presence of Number would be deterministic based on the XPOS, but this is not the case:

The handful of VB tokens with the feature are subjunctives (Mood=Sub), presumably because they are finite, unlike most VB tokens: https://universal.grew.fr/?custom=6567f9f727bd2

However, it appears that GUM is inconsistent on modal auxes, assigning Number to a few hundred of them. Why is this?

All modal auxes have VerbForm=Fin in both corpora.

Modal auxes never inflect for number agreement, whereas other verbs/auxes do to an extent (depending on tense and person). That could be a reason to go with the EWT policy of not specifying the feature on modal auxes. But one might object that VBD forms other than those of "be" do not inflect for number either. In any case, apart from the XPOS there is no explicit category for modal auxes (as opposed to non-modal auxes), so why should the XPOS control the presence of Number? A policy that all finite verbs and auxes receive Number based on the subject would be simpler to explain.
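To check whether the presence of Number is deterministic given the XPOS, one could tabulate it directly from the CoNLL-U files. The following is only a minimal sketch: the function name `number_by_xpos` and the two sample sentences are made up for illustration and are not taken from GUM or EWT.

```python
from collections import Counter

def number_by_xpos(conllu_text):
    """Count VERB/AUX tokens per (XPOS, has Number feature) pair."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword-token ranges (1-2)
        # and empty nodes (1.1): only plain integer IDs are real tokens.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        upos, xpos, feats = cols[3], cols[4], cols[5]
        if upos not in ("VERB", "AUX"):
            continue
        has_number = any(f.startswith("Number=") for f in feats.split("|"))
        counts[(xpos, has_number)] += 1
    return counts

# Invented sample data (assumption), in standard 10-column CoNLL-U layout.
SAMPLE = "\n".join([
    "# text = She can swim",
    "1\tShe\tshe\tPRON\tPRP\tCase=Nom|Gender=Fem|Number=Sing|Person=3|PronType=Prs\t3\tnsubj\t_\t_",
    "2\tcan\tcan\tAUX\tMD\tVerbForm=Fin\t3\taux\t_\t_",
    "3\tswim\tswim\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\t_",
    "",
    "# text = They are here",
    "1\tThey\tthey\tPRON\tPRP\tCase=Nom|Number=Plur|Person=3|PronType=Prs\t3\tnsubj\t_\t_",
    "2\tare\tbe\tAUX\tVBP\tMood=Ind|Number=Plur|Person=3|Tense=Pres|VerbForm=Fin\t3\tcop\t_\t_",
    "3\there\there\tADV\tRB\tPronType=Dem\t0\troot\t_\t_",
])

counts = number_by_xpos(SAMPLE)
for (xpos, has_number), n in sorted(counts.items()):
    print(xpos, "Number" if has_number else "no Number", n)
```

Any XPOS that shows up with both `True` and `False` counts on a real treebank file is one where the feature is not determined by the tag alone.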

amir-zeldes commented 7 months ago

Sounds like a bug in GUM, I'll investigate, thanks

nschneid commented 2 months ago

@amir-zeldes In GUM I still see 6 MDs with Number
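For reference, a count like this can be reproduced without a query tool by scanning the CoNLL-U files for MD tokens whose FEATS include Number. This is a rough sketch, not the query actually used; `modals_with_number`, the sentence IDs, and the sample rows are hypothetical, with one row deliberately carrying Number to show a hit.

```python
def modals_with_number(conllu_text):
    """Return (sent_id, token_id, form, feats) for MD tokens with Number."""
    hits = []
    sent_id = None
    prefix = "# sent_id = "
    for line in conllu_text.splitlines():
        if line.startswith(prefix):
            sent_id = line[len(prefix):].strip()
            continue
        cols = line.split("\t")
        # Only plain integer IDs are real tokens; skip everything else.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        xpos, feats = cols[4], cols[5]
        if xpos == "MD" and any(f.startswith("Number=") for f in feats.split("|")):
            hits.append((sent_id, cols[0], cols[1], feats))
    return hits

# Invented sample data (assumption): demo-1 has an MD with Number, demo-2 does not.
MD_SAMPLE = "\n".join([
    "# sent_id = demo-1",
    "1\tShe\tshe\tPRON\tPRP\tNumber=Sing|Person=3|PronType=Prs\t3\tnsubj\t_\t_",
    "2\twill\twill\tAUX\tMD\tNumber=Sing|VerbForm=Fin\t3\taux\t_\t_",
    "3\tgo\tgo\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\t_",
    "",
    "# sent_id = demo-2",
    "1\tThey\tthey\tPRON\tPRP\tNumber=Plur|Person=3|PronType=Prs\t3\tnsubj\t_\t_",
    "2\tmight\tmight\tAUX\tMD\tVerbForm=Fin\t3\taux\t_\t_",
    "3\tgo\tgo\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\t_",
])

md_hits = modals_with_number(MD_SAMPLE)
for hit in md_hits:
    print(*hit)
```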

amir-zeldes commented 2 months ago

Will fix, thanks!