UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

Singular subject + "were" subjunctives #511

Closed: nschneid closed this issue 2 months ago

nschneid commented 4 months ago

In annotating Mood=Sub for subjunctives (#194) we didn't consider cases like "If I were a rich man".

Is "were" the only verb that works like this?

amir-zeldes commented 4 months ago

Is "were" the only verb that works like this?

Jein (yes and no). It's the only verb where the form would be distinct, due to Verner's law (English doesn't keep the extra vowel of the subjunctive ending, and the consonant difference is due to the stress pattern). Other verbs also do this ("If I did I'd be a millionaire"), but I'm guessing we don't want to annotate those.

Looks like for GUM we could get away with conditioning it on being a sibling of "if", but in theory that could cause false positives or hit cases that are not distinguishable (e.g. committee noun + if ... were).

nschneid commented 4 months ago

OK let's just annotate "were" because it's usually clear whether it's subjunctive or past tense.

nschneid commented 4 months ago

To be clear, the options for "were" as AUX, with * as wildcard:

Past tense: plural or 2nd person

(plural agreement includes committee nouns and subjects with and-coordination)

were    be  AUX VBD Mood=Ind|Number=Plur|Person=*|Tense=Past|VerbForm=Fin
were    be  AUX VBD Mood=Ind|Number=Sing|Person=2|Tense=Past|VerbForm=Fin

Subjunctive: singular and not 2nd person

were    be  AUX VBD Mood=Sub|Number=Sing|Person=1|Tense=Past|VerbForm=Fin
were    be  AUX VBD Mood=Sub|Number=Sing|Person=3|Tense=Past|VerbForm=Fin
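
The four rows above boil down to one decision: Mood=Sub only for a singular 1st- or 3rd-person subject, Mood=Ind otherwise. A minimal Python sketch of that decision, assuming the subject's number and person are already known (the function name `were_feats` and its arguments are hypothetical, not part of any validator):

```python
def were_feats(number: str, person: str) -> str:
    """Build the FEATS string for the auxiliary "were".

    number: "Sing" or "Plur" (committee nouns and and-coordinated
            subjects are assumed to be passed in as "Plur").
    person: "1", "2" or "3".
    Both arguments are hypothetical inputs for this sketch.
    """
    if number not in ("Sing", "Plur") or person not in ("1", "2", "3"):
        raise ValueError(f"unexpected number/person: {number}/{person}")
    # Subjunctive only when the form cannot be an ordinary past tense:
    # singular subject that is not 2nd person.
    mood = "Sub" if number == "Sing" and person in ("1", "3") else "Ind"
    return f"Mood={mood}|Number={number}|Person={person}|Tense=Past|VerbForm=Fin"
```

For example, `were_feats("Sing", "1")` covers "If I were a rich man", while `were_feats("Sing", "2")` and any plural subject stay indicative.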

AngledLuffa commented 4 months ago

So now should we go through the treebanks labeling incorrect usages of "was"?

It always annoys me when I hear lyrics such as "Just another manic Monday. I wish it was Sunday..."

nschneid commented 4 months ago

I think we should be descriptivist about this one. :) It's clearly morphosyntactically subjunctive if it's "were" + singular + 1st or 3rd person; otherwise it's hard to tell.

nschneid commented 4 months ago

As an English speaker who does not know historical linguistics I have staggeringly little intuition about tense with subjunctives. :) @amir-zeldes I take it if we are saying "be" is a present subjunctive then "were" is a past subjunctive?

(CGEL p. 87 actually argues against calling "were" subjunctive at all, opting for irrealis, and says that neither "were" nor subjunctive "be" actually has a tense; but I think we should just go with the traditional terminology for UD.)

amir-zeldes commented 4 months ago

Yes, and it sounds like we want to take the position that syncretic cases like "you were" and plurals are seen as Ind, right? If so, and we want to require 'if' for safety, here's how I'll implement it for GUM:

text=/.*/;func=/nsubj/&xpos=/NNP?S/ #1>#2   #1:storage=not_subjv_parent
text=/.*/;func=/nsubj/&lemma=/you|they|we/  #1>#2   #1:storage=not_subjv_parent
text=/.*/&storage!=/not_subjv_parent/;lemma=/if/;text=/were/    #1>#2;#1>#3 #3:morph+=Mood=Sub;#3:morph+=Tense=Past

nschneid commented 4 months ago

There was a token with "whether" instead of "if", so maybe add that to the rule. (And I can imagine other things like "Were he to leave", but those will be rare.)

BTW here is the neaten validator update that includes this: 0ff690c

amir-zeldes commented 3 months ago

"whether" is not attested with "were" in that way for GUM; I guess you mean EWT? I added it with just "if" for now. Thanks for adding to neaten!