In general, GUM and EWT contain Number features for verbs/auxes to reflect agreement with the subject. I assumed the presence of Number would be deterministic based on the XPOS, but this is not the case:
- GUM counts: present for a handful of VB tokens, all VB{P,D,Z} tokens, no VBN/VBG tokens, and a large minority of MD tokens
- EWT counts: present for a handful of VB tokens, all VB{P,D,Z} tokens, no VBN/VBG tokens, and no MD tokens
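Counts like these can be reproduced with a short script over the CoNLL-U files. The following is only a minimal sketch, not the query actually used for the numbers above: the function name `number_by_xpos` and the toy `sample` sentences are made up for illustration, and it assumes standard 10-column CoNLL-U with XPOS in column 5 and FEATS in column 6.

```python
from collections import Counter

VERBAL_XPOS = ("VB", "VBP", "VBD", "VBZ", "VBN", "VBG", "MD")

def number_by_xpos(conllu_text, xpos_tags=VERBAL_XPOS):
    """Tally, per XPOS tag, how many tokens carry a Number feature."""
    counts = {tag: Counter() for tag in xpos_tags}
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges (1-2), empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        xpos, feats = cols[4], cols[5].split("|")
        if xpos not in counts:
            continue
        has_number = any(f.startswith("Number=") for f in feats)
        counts[xpos]["with" if has_number else "without"] += 1
    return counts

# Toy input (hypothetical sentences, not taken from GUM or EWT).
sample = "\n".join([
    "# text = She can sing",
    "1\tShe\tshe\tPRON\tPRP\tNumber=Sing|Person=3\t3\tnsubj\t_\t_",
    "2\tcan\tcan\tAUX\tMD\tVerbForm=Fin\t3\taux\t_\t_",
    "3\tsing\tsing\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\t_",
    "",
    "# text = She sings",
    "1\tShe\tshe\tPRON\tPRP\tNumber=Sing|Person=3\t2\tnsubj\t_\t_",
    "2\tsings\tsing\tVERB\tVBZ\tNumber=Sing|Person=3|VerbForm=Fin\t0\troot\t_\t_",
])

for tag, tally in number_by_xpos(sample).items():
    print(tag, dict(tally))
```

To run it on a real treebank, pass the contents of the treebank's `.conllu` file in place of `sample`.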
GUM appears to be inconsistent on modal auxes, assigning Number to a few hundred of them but leaving it off the rest. Why is this?
All modal auxes have VerbForm=Fin in both corpora.
Modal auxes never inflect for number agreement, whereas other verbs/auxes do to an extent (depending on tense and person). That could be a reason to go with the EWT policy of not specifying the feature on modal auxes. But one might object that VBDs other than forms of "be" do not inflect for number either. In any case, apart from the XPOS there is no explicit category for modal auxes (as opposed to non-modal auxes), so why should modal status control the presence of Number? A policy that all finite verbs and auxes receive Number based on the subject would be simpler to explain.
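If the "all finite verbs and auxes receive Number" policy were adopted, the tokens that would need updating could be listed mechanically. Here is a rough sketch under the same assumptions as before (standard 10-column CoNLL-U); the function name `finite_without_number` is hypothetical and not part of any existing validator.

```python
def finite_without_number(conllu_text):
    """Return (form, xpos) for finite VERB/AUX tokens that lack Number."""
    missing = []
    for line in conllu_text.splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, multiword ranges, and empty nodes.
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, upos, xpos = cols[1], cols[3], cols[4]
        feats = cols[5].split("|")
        if upos not in ("VERB", "AUX") or "VerbForm=Fin" not in feats:
            continue
        if not any(f.startswith("Number=") for f in feats):
            missing.append((form, xpos))
    return missing

# Toy input (hypothetical sentence, not taken from GUM or EWT).
rows = [
    "# text = She can sing",
    "1\tShe\tshe\tPRON\tPRP\tNumber=Sing|Person=3\t3\tnsubj\t_\t_",
    "2\tcan\tcan\tAUX\tMD\tVerbForm=Fin\t3\taux\t_\t_",
    "3\tsing\tsing\tVERB\tVB\tVerbForm=Inf\t0\troot\t_\t_",
]
print(finite_without_number("\n".join(rows)))
```

Under the EWT policy, every MD token would show up in this list; under the proposed policy, the list should be empty.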
The handful of VB tokens with the feature are subjunctives (Mood=Sub): https://universal.grew.fr/?custom=6567f9f727bd2 This is presumably because they are finite, unlike most VB tokens.