UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Guidelines for Aspect feature & English #978

Closed by nschneid 6 months ago

nschneid commented 1 year ago

I would like to consider the Aspect guidelines and whether/how they should be adopted for English.

Currently we are not using the feature in EWT or GUM. To categorize the verb form in isolation it is sufficient to use Tense and VerbForm. But perhaps we should add Aspect to reflect constructions within the verb complex like the perfect and progressive. (We have already decided to use Voice=Pass for the passive construction, which similarly operates as a combination of auxes and verb form. Because auxes are functional elements supporting a predicate I think it makes sense to view them as influencing its features.)

If we were to go that route, should we use Aspect across the board? I'm not sure it makes sense for simple past/present/future. The explanation for Hab (habitual) actually says it applies to simple present:

English simple present has this aspect. Examples:

[en] he attends classes of Japanese

But this seems overly semantic to me: e.g. it is odd to apply the term "habitual" if the verb is stative ("the church stands next to the cemetery"). Is there an example from another language where the habitual is more directly encoded?

Finally, there are other periphrastic constructions that have the effect of modifying aspect or other predicate properties but not just through auxes: "used to" (habitual), "be going to" (futurate), "be about to", "be supposed to", etc. Since the semantically main predicate is syntactically an xcomp dependent, I suspect we should stay away from representing these constructions with morphological features.

amir-zeldes commented 1 year ago

I think this has been discussed a bit in the past, and the consensus was that FEATS is generally for word-wise features, so periphrastic aspect doesn't belong there.

But in terms of what makes sense to annotate for English, that would mainly be morphosyntactic tense IMO, since it's possible to know whether a clause is present perfect or present simple based purely on form, whereas distinctions like habitual require semantic disambiguation, as you pointed out. GUM has tense annotations (automatically derived) per RST EDU, so basically at the clause level, but they only encode the overt tense kinds, like "past progressive" & co.
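To illustrate the point that the perfect and progressive can be read off form alone, here is a minimal sketch in Python. It is not how GUM derives its tense annotations; the function name, the input shape (a list of `(lemma, xpos)` pairs for the `aux` dependents in surface order, plus the head verb's Penn-style XPOS tag), and the output labels are all assumptions made for this example.

```python
def classify_verb_complex(aux_chain, head_xpos):
    """Label a verb complex from form alone (hypothetical helper).

    aux_chain: (lemma, xpos) pairs for the auxiliaries, in surface order.
    head_xpos: Penn-style tag of the main verb, e.g. "VBN" or "VBG".
    Returns a set drawn from {"perfect", "progressive", "passive"}.
    """
    # Append the head so every auxiliary can inspect the form that follows it.
    chain = list(aux_chain) + [("<head>", head_xpos)]
    labels = set()
    for (lemma, _), (_, next_xpos) in zip(chain, chain[1:]):
        if lemma == "have" and next_xpos == "VBN":
            labels.add("perfect")        # have + past participle
        elif lemma == "be" and next_xpos == "VBG":
            labels.add("progressive")    # be + -ing form
        elif lemma == "be" and next_xpos == "VBN":
            labels.add("passive")        # be + past participle
    return labels


# "has been running": perfect (has + been) and progressive (been + running)
print(classify_verb_complex([("have", "VBZ"), ("be", "VBN")], "VBG"))
# "attends": no auxiliaries, so form alone yields no label (nothing about habituality)
print(classify_verb_complex([], "VBZ"))
```

Note that a simple present like "attends" gets an empty set: nothing in the form says whether the reading is habitual or stative, which is the disambiguation problem raised above.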

nschneid commented 1 year ago

In that case maybe I'll put the perfect and progressive constructions in MISC.
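As a rough sketch of what that could look like mechanically, the Python snippet below adds a key to the MISC column (the 10th tab-separated field) of a CoNLL-U token line and leaves FEATS untouched. The attribute name `VerbCxn` and the helper `add_misc` are hypothetical, chosen only for illustration; no such MISC key has been agreed on in this thread.

```python
def add_misc(conllu_line, key, value):
    """Set key=value in the MISC column of one CoNLL-U token line (hypothetical helper)."""
    cols = conllu_line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError("expected 10 tab-separated CoNLL-U columns")
    # "_" marks an empty MISC field; otherwise entries are separated by "|".
    misc = [] if cols[9] == "_" else cols[9].split("|")
    # Drop any earlier value for the same key before appending the new one.
    misc = [m for m in misc if not m.startswith(key + "=")]
    misc.append(f"{key}={value}")
    cols[9] = "|".join(misc)
    return "\t".join(cols)


line = "3\teaten\teat\tVERB\tVBN\tTense=Past|VerbForm=Part\t0\troot\t_\tSpaceAfter=No"
# "VerbCxn" is an assumed key name, not an established UD attribute.
print(add_misc(line, "VerbCxn", "Perfect"))
```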

Can we change the Hab example to one from another language? Languages that use it:

Akuntsu, Bororo, Breton, Buryat, Erzya, Irish, Kazakh, Lithuanian, Madi, Makurap, Marathi, Moksha, Nheengatu, Old Irish, Tagalog, Turkish, Turkish German, Uyghur, Wolof, Yupik

dan-zeman commented 1 year ago

For me, aspect in English is an example of the phrase-level features that we discussed in Dagstuhl and that should probably be represented somehow in MISC. (But the same would apply for English voice, for which you have apparently already decided to go the other way.)

Agreed that the documentation of Aspect should show one of the languages where the feature is actually used. However, I would not delete the English example completely because that is what many people will understand. So instead of saying only "English simple present has this aspect.", I would say something like "English simple present corresponds to this aspect; however, the feature is not used in English data because English aspect depends on presence/absence of certain verb forms and auxiliaries and it cannot be attributed to a single word in isolation."

nschneid commented 1 year ago

I don't think I would even refer to the "English simple present" as a whole: it isn't standard to describe the simple present as a habitual construction (its meaning varies). What about including some other languages with English glosses?

nschneid commented 6 months ago

I found a random Irish-IDT sentence with Aspect=Hab and will put that in place of the English one unless someone has a better suggestion: "Is gnách go mbíonn teocht ard iontu."

nschneid commented 6 months ago

Actually, I found Irish docs, will copy those instead.