UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

:tmod in English #893

Closed nschneid closed 1 year ago

nschneid commented 1 year ago

I have had some trouble interpreting {obl,nmod}:tmod as currently implemented in EWT. If :tmod is part of the deprel, I would have assumed it was there to distinguish certain constructions that have special syntax because they are temporal (such as prepositionless oblique nominal modifiers), basically as a temporal counterpart to the :npmod relations. However, this is not a universal opinion:

Originally posted by @dan-zeman in https://github.com/UniversalDependencies/docs/issues/508#issuecomment-345440516:

I don't see why preposition should make a difference. If you see a pattern in data, it does not necessarily mean that the pattern is a rule. It is also possible that there is an annotation error.

The English documentation has not been updated to UD v2 and the relevant page was named nmod-tmod until now. I just renamed the page and replaced nmod with obl. The second example clearly shows that preposition can occur in obl:tmod: http://universaldependencies.org/en/dep/obl-tmod.html

The guidelines quoted above make it sound like purely a semantic class distinction. Why, then, does it belong in a deprel?

There are in-between cases like measure phrases which appear in the :npmod docs but can be temporal ("5 miles longer/away" vs. "5 years later/ago", "10 dollars an hour" vs. "once a month").

Section 7 of the Mischievous Nominal Constructions paper discussed :npmod and :tmod and argued that both should be limited to adverbial (prepositionless) nominals. A different approach that might be simpler to interpret would be to limit :tmod to certain temporally specific constructions like dates.

amir-zeldes commented 1 year ago

It would be a little sad to throw out :tmod since it already exists in the English TBs, and even in some other languages. But in terms of what it currently means, my understanding matches the first one above, and the mischievous paper, i.e. that it can only be used if there is no preposition, like :npmod, and I'm pretty sure that's how it behaves in GUM.

dan-zeman commented 1 year ago

it can only be used if there is no preposition

I don't know if this is how it's understood by the people who use :tmod in other languages. I understand the objection that a purely semantically defined label should not be used for a syntactic relation, but maybe it's useful for applications and easy for annotators to apply? I can see some value in a subtype that reveals the semantics of the modifier. I see no value in a subtype that tells me that there is no preposition, as that is something I know already.

nschneid commented 1 year ago

If we want semantics wherever it might be useful for applications, we should have semantic roles. :) But I don't see that as the purpose of deprels.

In English we have a policy that obl and nmod dependents are normally PPs, and need to be subtyped otherwise (as possessive :poss or non-case-marked :npmod/:tmod).
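The policy just described can be sketched as a small consistency check. This is a minimal illustration, not a tool used by any treebank: the token tuples are hand-made toy data (not taken from EWT), and the function name `unmarked_plain_obl_nmod` is hypothetical. It flags plain `obl`/`nmod` dependents that have no `case` child, i.e. the ones the policy says should carry a subtype.

```python
# Toy tokens for "She left Friday on a train last week": (id, form, head, deprel).
# Hand-made for illustration only; "Friday" is deliberately left unsubtyped.
SENT = [
    (1, "She", 2, "nsubj"),
    (2, "left", 0, "root"),
    (3, "Friday", 2, "obl"),
    (4, "on", 6, "case"),
    (5, "a", 6, "det"),
    (6, "train", 2, "obl"),
    (7, "last", 8, "amod"),
    (8, "week", 2, "obl:tmod"),
]

def unmarked_plain_obl_nmod(tokens):
    """Return forms of plain obl/nmod dependents that have no `case` child."""
    # IDs of all tokens that govern at least one `case` dependent.
    has_case = {head for _, _, head, deprel in tokens if deprel == "case"}
    return [form for tid, form, _, deprel in tokens
            if deprel in ("obl", "nmod") and tid not in has_case]

print(unmarked_plain_obl_nmod(SENT))  # ['Friday']
```

Here "train" passes because it has a `case` child, and "week" passes because it is already subtyped; only the prepositionless, unsubtyped "Friday" is reported.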

amir-zeldes commented 1 year ago

In English we have a policy that obl and nmod dependents are normally PPs, and need to be subtyped otherwise (as possessive :poss or non-case-marked :npmod/:tmod).

Exactly - and I do see value in this, since it makes it possible to easily find PPs, which are definitely a syntactic class. The temporal thing is less justified as a syntactic label and somewhat redundant with entity types, which is a non-syntactic annotation. But just for stability and since it doesn't really hurt, I don't mind keeping the tmod subtype.

dan-zeman commented 1 year ago

makes it possible to easily find PPs

Can't you easily find them looking for the case relation?

nschneid commented 1 year ago

There is, for instance, preposition stranding where the preposition gets promoted to nmod or obl (and no case).
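A minimal sketch of why a `case` lookup alone can miss such cases. The tokens are hand-made toy data for "the house I live in", and the attachment of "in" as `obl` is an assumption that follows the stranding description above, not an analysis copied from a treebank. Both function names are hypothetical.

```python
# Toy tokens: (id, form, upos, head, deprel). Hand-made for illustration;
# "in" is promoted to obl because its nominal is missing (assumed analysis).
TOKENS = [
    (1, "the", "DET", 2, "det"),
    (2, "house", "NOUN", 0, "root"),
    (3, "I", "PRON", 4, "nsubj"),
    (4, "live", "VERB", 2, "acl:relcl"),
    (5, "in", "ADP", 4, "obl"),
]

def pp_heads_via_case(tokens):
    """Forms of tokens that govern a `case` dependent."""
    by_id = {t[0]: t for t in tokens}
    return [by_id[head][1] for _, _, _, head, deprel in tokens if deprel == "case"]

def promoted_adpositions(tokens):
    """Forms of adpositions that themselves bear obl or nmod (any subtype)."""
    return [form for _, form, upos, _, deprel in tokens
            if upos == "ADP" and deprel.split(":")[0] in ("obl", "nmod")]

print(pp_heads_via_case(TOKENS))     # []
print(promoted_adpositions(TOKENS))  # ['in']
```

In this toy tree the `case` search finds nothing, while the promoted adposition is only visible by looking at `obl`/`nmod` together with the ADP tag.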

nschneid commented 1 year ago

In Croft's terms (as I understand them), the overarching category nmod describes constructions at the level of information packaging (nominal modification) while the subtypes indicate strategies—nmod:poss for a genitive strategy, nmod:npmod/:tmod for a zero-coded strategy. In principle we could add nmod:pp to indicate the adpositional strategy, which is far and away the most prevalent in English. But from a practical perspective it's easier not to subtype those.

Note that the presence of a preposition is also a key criterion in determining the coreness of clausal dependents in English (any preposition-marked nominal is automatically obl; we don't treat to-PPs as iobj, for instance).

amir-zeldes commented 1 year ago

Agreed, @nschneid. Basically the shift in English subtype deprels is an indirect result of the prohibition on advmod with non-ADV IMO - if I had to merge obl:npmod/tmod with something, I would much rather see them merged with advmod than with obl/nmod, which are the canonical way to tag PPs in English. If we didn't have a clear label for PP heads I would see it as a big caveat to using UD for English.

Stormur commented 1 year ago

Just to give an external perspective, some time ago we began to introduce the tmod/lmod (locative) subtypes for advmod and obl in our latest Latin treebanks. The annotation is as yet, unfortunately, rather patchy, and some problems have arisen: the first is when to distinguish possible metaphorical readings (like spatial analogies for time) and which tag to choose, and the second is the "concurrence" with obl:arg for e.g. verbs of motion (as for the latter, I am inclined towards the prevalence of the more specific tmod/lmod over arg). Anyway, the presence or absence of prepositions plays no role in our choices.

I think that, judging from the current state of things, there is no problem in having a semantic subtype: this is what tmod/lmod is alongside the established arg, the recently mentioned cmp (for comparisons), and others still, from some point of view maybe even relcl. So, once you have the syntax in the main dependency, there can be room for semantic specifications which are otherwise not easily retrievable... no?

I also agree with @dan-zeman in not fully understanding the use of tagging the head of a prepositional phrase, which is tautologically such because it already has a case dependent. Maybe I am missing the issues raised by stranding (I would like to see some problematic examples to understand!).

colinbatchelor commented 1 year ago

that it can only be used if there is no preposition, like :npmod

That's exactly how I've been using it in Scottish Gaelic but I think Irish does it semantically.

amir-zeldes commented 1 year ago

that it can only be used if there is no preposition, like :npmod

Yup, the latest UD Hebrew corpus also does it this way (only when no preposition is there), so I think there are probably a few other corpora like that.