Closed alvelvis closed 12 months ago
There is relevant discussion in #751. Basically I think the appos relation is mixing POS/phrase type criteria (has to be an NP) with grammatical function criteria (yours is an example of what CGEL would call a Supplement).
This is a case that in Latin IT-TB and LLCT we might treat by means of conj, in particular the "explicative" subtype of conj, conj:expl (a subrelation which has been introduced in these treebanks; documentation still has to follow).
The appos relation in UD is part of the "nmod family" in a sense, and is very specific in its usage, as you notice. In looser cases like this one, where new elements are added or something already introduced is expanded upon, and where the two elements are syntactically interchangeable, in my opinion the most fitting relation is one of the "horizontal" kind, and I found the best and simplest one to be conj. There are many types of co-ordination, and the "id est co-ordination" is one of them (consequently, a possible introducing "id est" element has to be treated as a CCONJ). Besides, from a very practical point of view, I observed that annotators found this annotation intuitive and received it well.
But actually, here a simple conj seems to fit well enough to me. The part a behavior... can be seen as having an elliptical copula, and this is what makes it parallel to the preceding clause: ...used to browse... [and this is] a behavior... I am not sure if this warrants an :expl, but in the end I see a clear case of asyndetic co-ordination!
Nice example! This reminds me a little of discourse deixis in coreference, where you refer back to something that doesn't actually have an NP antecedent. I agree that avoiding appos here seems appropriate. I would have chosen parataxis over conj, just because there is no overt coordination marker.
I agree it is coreference-like. As I understand it CGEL argues that Supplements are anaphoric.
Another observation: "which is" can be inserted before the NP, making it resemble a nonrestrictive relative clause. Perhaps "who/which is" insertion should be a test for determining the deprel.
TBH I don't love conj (you can't simply insert "and") or parataxis (which would suggest it is something less than a fully integrated syntactic sentence) but also don't have a better suggestion.
@amir-zeldes Do you know what PTB would do here? Is there a function tag for NP supplements?
^ I found it quickly on pp. 12-13 of http://languagelog.ldc.upenn.edu/myl/PennTreebank1995.pdf: NP-ADV
Makes me wonder if UD should call this appos:adv or similar.
In English-GUM, it seems that supplements such as "which is a problem" have been analyzed as acl:relcl depending on a noun in the previous clause, which is quite bad.
http://match.grew.fr/?corpus=UD_English-GUM@2.7&custom=6011172ecc8ce (see second example and also fourth example where it is ccomp)
The equivalent construction in French-GSD ("ce qui est un problème") has been analyzed as a parataxis depending on the root of the previous clause, which is much better I think.
http://match.grew.fr/?corpus=UD_French-GSD@2.7&custom=60111545c1385
My preference for conj over parataxis is that there still is some kind of cohesion, both syntactic and in meaning, between all the elements of the sentence, and the "supplement" is not really orthogonal or extraneous to the main clause. The classic UD example is Latin veni, vidi, vici 'I came, I saw, I won'.
I think there is somehow some stigma about asyndetic co-ordination, but it seems to me a very widespread phenomenon, also in languages such as English or Latin which prefer an explicit conjunction. It's not really the same as the syndetic kind, so the insertion of a conjunction might not fully do justice to it, but I think it helps bring it into focus.
The conj treatment is also (consequently) the one we are using now for "clausal relatives" where the relative element is in the core (He said so and so, which was nice), again acknowledging more cohesion than a simple parataxis (and keeping the annotation of the relative element). If it is not in the core, on the contrary, I think it might be appropriate to treat it as an advcl (and the oblique relative element tends to be grammaticalised as a conjunction).
Wild suggestion: would it be thinkable to extend appos to such interclausal relations? Or better, create a new clausal appos relation?
In English-GUM, it seems that supplements such as "which is a problem" have been analyzed as acl:relcl depending on a noun in the previous clause, which is quite bad.
@sylvainkahane I don't think this is the same construction - here we have an explicit relative clause which attaches to a verb, complete with a subordinate clause predicate and in this case a relative pronoun. I agree in the 4th example the tree isn't right, since the clause should come out of the verb, not the noun teachers. The canonical analysis in GUM is indeed:
- maybe they had bad teachers, which is a problem
- acl:relcl(had,problem)
- nsubj(problem,which)
And I think in this case both conj and parataxis would be wrong. In the construction being discussed here, it is not so clear that there is a predicate in the supplement, so it looks more like apposition (but I agree, that is probably not the best choice here - I still think parataxis is the lesser evil for those cases).
Once again, we run into issues with deprels mixing POS/phrase type and grammatical relation. I think it's fair to say this is indeed a relative clause, but it's an adverbial rather than adnominal relative clause. So should it be advcl:relcl?
While some of the supplements arguably modify a nominal indirectly, and thus acl could be argued, this one is pretty clearly modifying a verb or VP/clause.
My preference for conj over parataxis is that there still is some kind of cohesion, both syntactic and in meaning, between all the elements of the sentence, and the "supplement" is not really orthogonal or extraneous to the main clause. The classic UD example is Latin veni, vidi, vici 'I came, I saw, I won'.
While I agree that this is different from the canonical use of the term "parataxis", parataxis does apply to reported speech which doesn't follow the syntax typical of most verbs. I don't know if the speech verb is any more "extraneous" to the clause than the supplement in
They explained that elk in Yellowstone used to browse unmolested, a behavior that prevented the saplings from reaching mature stages
I probably have a different intuition because in the similar French construction (peut-être ils ont de mauvais profs, ce qui est un problème), the supplement is not exactly a relative clause. It is an NP with the pronoun "ce" 'that' modified by a relative clause. This construction is equivalent to free relatives in English (What you do is bad = Ce que tu fais est mal). I suspect the supplement "which is a problem" to be also a free relative.
https://en.wikipedia.org/wiki/Relative_clause defines a free relative clause as a relative clause lacking an overt external antecedent, as in "I like what I see".
For "maybe they had bad teachers, which is a problem", I would interpret "they had bad teachers" as the antecedent of "which", so a regular RC but not headed by a nominal.
If I understand the French correctly it is not a free relative clause because it requires an overt pronoun head to serve as antecedent, like "That which you do is bad" or "He who finishes first wins".
You're right, literally ce que tu fais is equivalent to that which you do. But ce que and ce qui are very grammaticalized in French and have replaced pronouns in free relatives, as well as in indirect interrogatives:
Je me demande ce qu'elle fait. 'I wonder what she's doing.'
Je sais ce qui va se passer. 'I know what will happen.'
So should it be advcl:relcl?
@nschneid I would love to be able to find all of these cases using special labels, but realistically they are so rare that introducing them, or appos:adv etc., would just create confusing, super-sparse labels for parsers to deal with (and make our confusion matrices even bigger than they already are). At the same time, it would make users interested in relative clauses miss these if they naively search for just acl:relcl.
In general I think we need to avoid suggesting new deprels (including subtypes) for phenomena that are very rare. Cases like "VP which..." are easy enough to find by searching for verbs modified by acl:relcl, so I'm happy with leaving that the way it is.
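The search described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy CoNLL-U sentence is built by hand, only acl:relcl(had, problem) and nsubj(problem, which) are taken from the thread, and the remaining columns and the helper names (parse_conllu, find_relcl_on_verbs) are hypothetical.

```python
# Toy sentence. Only acl:relcl(had, problem) and nsubj(problem, which) come
# from the thread; every other column value is an illustrative assumption.
ROWS = [
    "1 maybe maybe ADV _ _ 3 advmod _ _",
    "2 they they PRON _ _ 3 nsubj _ _",
    "3 had have VERB _ _ 0 root _ _",
    "4 bad bad ADJ _ _ 5 amod _ _",
    "5 teachers teacher NOUN _ _ 3 obj _ _",
    "6 , , PUNCT _ _ 10 punct _ _",
    "7 which which PRON _ _ 10 nsubj _ _",
    "8 is be AUX _ _ 10 cop _ _",
    "9 a a DET _ _ 10 det _ _",
    "10 problem problem NOUN _ _ 3 acl:relcl _ _",
]
# CoNLL-U columns are tab-separated, so rebuild the rows with tabs.
SAMPLE = "\n".join("\t".join(row.split()) for row in ROWS) + "\n"


def parse_conllu(text):
    """Yield each sentence as a list of token dicts (basic word lines only)."""
    sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        sentence.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    if sentence:
        yield sentence


def find_relcl_on_verbs(text):
    """Return (head form, dependent form) pairs where acl:relcl hangs off a VERB."""
    hits = []
    for sentence in parse_conllu(text):
        by_id = {tok["id"]: tok for tok in sentence}
        for tok in sentence:
            head = by_id.get(tok["head"])
            if tok["deprel"] == "acl:relcl" and head is not None and head["upos"] == "VERB":
                hits.append((head["form"], tok["form"]))
    return hits


print(find_relcl_on_verbs(SAMPLE))
```

A query tool such as Grew-match would express the same pattern more compactly; the point is only that the verb-headed cases stay retrievable without a dedicated label.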
So does that mean we need to relax the definition of acl to "clauses modifying nominals, and also all relative clauses"?
In essence I am asking whether deprel definitional criteria like "modifies a nominal" are strict, or whether deprels should be viewed as prototypes and rare special cases mapped to the most similar prototype.
My preference for conj over parataxis is that there still is some kind of cohesion, both syntactic and in meaning, between all the elements of the sentence, and the "supplement" is not really orthogonal or extraneous to the main clause. The classic UD example is Latin veni, vidi, vici 'I came, I saw, I won'.
While I agree that this is different from the canonical use of the term "parataxis", parataxis does apply to reported speech which doesn't follow the syntax typical of most verbs. I don't know if the speech verb is any more "extraneous" to the clause than the supplement in
They explained that elk in Yellowstone used to browse unmolested, a behavior that prevented the saplings from reaching mature stages
I do not fully understand, because parataxis does not apply in this example, since there is a ccomp introduced by means of that and it is indirect speech [edit: sorry, I got a little bit confused here and didn't notice it was the original sentence].
The rationale of parataxis for reported speech, in my understanding, is that the main clause is followed by other juxtaposed clauses which are independent both on a syntactic level (no connective elements, and they begin a new sentence with its own structure) and on a "semantic level" (they are kinds of quotation and as such do not need to have any bearing on the clause by which they are introduced). But this is really not the case for this "supplement". I mean: in something like
They bought meat, cooked a lot yesterday, great party
I do not see a parataxis, but rather a conj. I wonder if this is any different from the sentence in the original post. Whereas in
I went to the university yesterday - do you know Ann?
I see a parataxis (the question comes out of the blue in the midst of another topic).
Ruling out acl and appos, if only for formal reasons, I do not see the reasons for advcl either. The supplement is not "acting inside" the main sentence as an adverbial clause is expected to do, it is not the secondary predication of anything, it just adds something, it is outside of it: we have a sequence of clauses where one recalls the previous one by means of a relative element which has no real antecedent. Even if the suggested extension of acl sounds interesting, I am no longer even sure whether this can really be called a relative clause, or whether it rather has to be seen just as a possible strategy for interclausal cohesion. So, the most fitting relation between "peer sentences" would be conj. Of course this deprel has the problem of obfuscating the nature of the co-ordinated elements. Is it something that can maybe be better specified with enhanced dependencies?
In essence I am asking whether deprel definitional criteria like "modifies a nominal" are strict, or whether deprels should be viewed as prototypes and rare special cases mapped to the most similar prototype.
I would be oriented towards a strict formal reading, which I think is the intended one, if nothing else to guarantee annotational coherence between treebanks.
are special cases mapped to the most similar prototype
That is an excellent question - personally I've always understood categorical annotation to be about finding 'the best class from all the options'. If we adhere to acl as strictly limited to nouns and follow the guidelines to the letter, then I suppose we will be forced to choose dep, used "when it is impossible to determine a more precise relation". However the guidelines also say "The use of dep should be avoided as much as possible"!
The advantage of our discussions is that we can think about what we want, and personally I find acl:relcl to be the best option here, and more in keeping with the guidelines than anything other than dep. A lot of annotation decisions we've made appeal to an analogy to more 'normal' alternatives, or ellipsis, etc. so I think in this case lumping a rare construction with its most similar common counterpart is the most practical thing we can do.
They bought meat, cooked a lot yesterday, great party
I would call this conj, especially since it doesn't look like you can insert "and" here:
The result would have to be an unlike-coordination (PTB's label UNC), which is inevitable if we have 'and', but otherwise not necessary IMO.
More generally though, my understanding of the difference between parataxis and conj at the clausal level is the absence of an explicit cc, so at least in the corpora I've worked on, two sentences standing next to each other with just a comma in between are always parataxis. I've never understood "veni, vidi, vici" as conj. If it is conj, then what would it take for two sentences next to each other to be parataxis?
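The criterion at stake, whether a conjunct carries an explicit cc, is easy to check mechanically. The sketch below is an assumption-laden illustration: the two toy trees are hand-built (the first follows the conj analysis of veni, vidi, vici mentioned earlier in the thread, the second is an invented marked variant), and asyndetic_conjuncts is a hypothetical helper name. It relies on the UD v2 convention that cc attaches to the conjunct it introduces.

```python
# Tokens are (id, form, head, deprel). Both trees are hand-built toy examples.
VENI_VIDI_VICI = [
    (1, "veni", 0, "root"),
    (2, ",", 3, "punct"),
    (3, "vidi", 1, "conj"),
    (4, ",", 5, "punct"),
    (5, "vici", 1, "conj"),
]

# Invented variant with an overt coordinator, for contrast.
MARKED = [
    (1, "veni", 0, "root"),
    (2, "et", 3, "cc"),
    (3, "vici", 1, "conj"),
]


def asyndetic_conjuncts(tokens):
    """Return the forms of conj dependents that have no cc child of their own."""
    # Ids of tokens that govern at least one coordinating conjunction.
    has_cc = {head for (_tid, _form, head, deprel) in tokens if deprel == "cc"}
    return [
        form
        for (tid, form, _head, deprel) in tokens
        if deprel == "conj" and tid not in has_cc
    ]


print(asyndetic_conjuncts(VENI_VIDI_VICI))  # conjuncts with no coordinator
print(asyndetic_conjuncts(MARKED))          # empty: the conjunct has a cc
```

Note that in a series like "A, B and C" the middle conjunct also has no cc, so a corpus query built on this check would need to look at the whole coordination rather than at single conjuncts.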
If we adhere to acl as strictly limited to nouns and follow the guidelines to the letter, then I suppose we will be forced to choose dep, used "when it is impossible to determine a more precise relation". However the guidelines also say "The use of dep should be avoided as much as possible"!
Is there a clear reason that advcl wouldn't apply? I agree that it counts as a relative clause, but if the universal relation criteria should take precedence over subtype criteria, then I don't see why it wouldn't be advcl.
I think that would be surprising to most users - or at least to me :)
My point above was that users who want to find all relative clauses are likely to search for acl:relcl, and would miss this if it were tagged differently. I'm also not 100% clear why advcl applies - if there is no oblique or adverbial marker, then it technically isn't marked differently from an unmediated, core dependent, so wouldn't it be ccomp?
PS - Please please don't take this to mean that I think it should be ccomp - I'm just saying it makes for a rather poor example of advcl from a form perspective, but makes a rather intuitive acl:relcl in my opinion.
For noun-headed RCs an advantage of calling it acl:relcl is it captures an alternation between gerundive/participial clauses (acl) and relative clauses (acl:relcl).
So I'm wondering if there are NON-relative, clausal supplements anchored by verbs. How about:
What would be a good deprel for (1)? Not acl, surely?
Interesting how this discussion connects with one example in Portuguese that I mentioned some time ago in https://github.com/ufal/udpipe/issues/128. The sentence is
«Amadeu Cury» nasceu em Guaxupé (MG) no dia 13 de maio de 1917, filho de Espir Cury e de Nazaré Cury. («Amadeu Cury» was born in Guaxupé (MG) on 13 May 1917, son of Espir Cury and Nazaré Cury)
We have discussed two interpretations: appos(son,Amadeu) vs ellipsis+conj. That is, the second clause has an ellipsis, he is son of..., where the subject and copula were elided. This reading considers two clauses connected by conj. For information extraction, this interpretation is much more indirect: one can't directly get the information that Amadeu is the son of Espir.
In the appos interpretation, we have an NP to NP dependency, right? Something like Amadeu was born ... as a son of ... (describing how he was born), which is a little bit strange in Portuguese, but...
@arademaker Yes I think that's like the example at the beginning of the thread. An NP supplement, hard to know how to attach it to the clause and whether it's appos.
- Maybe they had bad teachers, posing a problem for their careers [non-RC]
This construction used to carry the special label vmod in Stanford Dependencies, and was converted wholesale to advcl, which I'm comfortable with. I think there's a kind of linguistics 'smell test', where I imagine what a garden variety linguistics student would say when you tell them "annotate this as advcl" - with the example in (1), I have no doubt they wouldn't blink and would think it's fine.
I contrast that with things like oblique argument clauses (I'm relying [on you to come]), where every year I teach students to tag this as advcl I get some very cringey expressions and protests, since in general linguistics I think no one would call this an adverbial clause (despite being oblique and attaching to a verb). I think for the relative clauses attached to a verb, again normal linguists would be very surprised to consider them adverbial clauses, even if they attach to a verb.
For UDv3 how about we merge acl and advcl into modcl? Then we can have modcl:relcl (or even make relcl a universal relation). :)
Why? I think the current expressivity of acl vs. advcl is quite useful, so throwing away this distinction, which is found in many corpora, seems counterproductive. I also haven't heard annotators complaining that they can't tell them apart (at least not for English), nor are they commonly confused by parsers, judging by confusion matrices. Isn't it enough if we know that both are a type of clausal modifier?
What do they actually distinguish besides the POS of the head? Is it useful mainly for copular predicates since a modifier clause could be attaching to the clause vs. the adjective or noun?
If we merged them we wouldn't have this adverbial relative clause problem, for one.
Well, first of all I think there can be adverbial clauses attached to a noun just like regular adverbs can attach to nouns, especially if that noun is deverbal:
But more generally, I firmly believe we should avoid messing with the deprel list if we can at all help it. Every major change we make to the main relations (as opposed to localized decisions of which existing relation to use for a construction we don't have good guidelines for) is going to have lots of consequences:
In my opinion stable standards are really important and these are just some of the reasons why.
Yes I know your opinion on stable standards and I'm not sure there is a serious appetite for a UDv3. So it was only a half-serious suggestion.
But I think your example shows why the current deprels as defined in the guidelines are problematic. advcl is defined as "An adverbial clause modifier is a clause which modifies a verb or other predicate (adjective, etc.), as a modifier not as a core complement." So by my reading "Kim's arrival exactly while I was drinking" would NOT meet that criterion because the clause is headed by a noun, and should be acl. There is nothing in the acl/advcl definitions about the marking (while, etc.) of the embedded clause.
BTW this is related to the asterisk on advmod in https://universaldependencies.org/u/dep/index.html. advmod is more liberal than the other relations about what kinds of things it can attach to.
I don't think that POS of the head should be an overriding consideration, so I guess I would be in favor of amending the definition. For me, a clause that fulfills the function of an adverb (substitutable/pronominalizable/interrogable by an adverb) is an advcl.
If the only important thing is the POS of the head, then should we be calling "then" in the example above nmod? I think that from a traditional linguistics perspective that would be strange, and it raises the question of why we bother to distinguish function labels at all if we are just encoding the POS tags of the parent and child nodes. I think of deprels as primarily designating syntactic functions, rather than morphological classes.
If the only important thing is the POS of the head, then should we be calling "then" in the example above nmod?
No, by the definition of advmod it is less sensitive to the POS of the head than other deprels.
I think of deprels as primarily designating syntactic functions, rather than morphological classes.
I think this is a huge tension in UD: it is lexicocentric and thus views syntactic functions as defined by POS classes. But this leads to problematic corner cases such as relative clauses which modify verbs, and are thus not really adjectival relative clauses, or adverbial-looking clauses which modify deverbal nouns. English, at least, seems to distinguish phrase types from grammatical functions, and when UD tries to convey both in a limited set of deprels, things get messy.
I have a hunch that UD could, in theory, be revamped to make less-POS-sensitive deprel distinctions (is this what Bill Croft has suggested?). But I'm not holding my breath for this to actually happen. :)
The morphological classes that tell us that the thing being modified is a noun/verb are already distinguished in the POS tags, so shouldn't we want deprels to mainly reflect function?
I'm actually in favor of modifying the universal definitions to reflect functional motivation for some of these labels, at least as an option, but as far as what we decide to do for unusual constructions like relative clause dependent of verbs in English, I think this is still a language-specific guideline decision and doesn't necessarily mean we have to change all of UD as a result.
I guess technically it's not whether the head is verb vs. noun, it's predicate vs. (non-predicate) nominal. https://universaldependencies.org/u/dep/index.html
Still, I'm not sure there is a sharp distinction between non-core modifiers of predicates vs. modifiers of non-predicate nominals when the modifier is a clause. At least not in English, when we consider that relative clauses can modify verbs and that deverbal non-predicate nominals can have adverbial-ish clause modifiers.
Two remarks.
First, @nschneid @amir-zeldes you started a new discussion about acl and advcl. Just a word about this. In SUD we decided to merge all modifier relations (amod, advmod, acl, advcl) into one mod relation. We consider that the POS of the governor and the dependent is sufficient to recover the UD relation.
Second, I don't think that supplements are modifiers. For instance, they cannot be moved to the front of the sentence, while in most languages, sentence/verb modifiers can. They are almost new sentences, but they are less autonomous than true sentences. They can be NPs (initial example of this thread) as well as subordinate clauses (relative or participial clauses as in the example given by @amir-zeldes). Maybe we need a new relation for supplements, which would complete our set of relations for detached elements (I mean dislocated, discourse and parataxis). In the French tradition, we distinguish what is called microsyntax, for arguments and modifiers, from macrosyntax, for looser elements which escape the verbal construction. Clearly, supplements fall in the second category and are macrosyntactic elements. The best existing relation, for the time being, would be parataxis, but parataxis is already a relation covering many different things (unmarked coordination, parentheses, inserts, clausal discourse markers, clausal dislocation) and the supplement is not always a clause. That's why I think that a new relation is required.
@sylvainkahane at least for the current English guidelines, the POS of the parent and child alone are not enough to reconstruct the relation: verbs can be amod (a stunning picture), acl (a dish to die for), advcl (his arrival while I drank coffee), or acl:relcl (the problem that this caused). In all of these, the parent is tagged NOUN and the child VERB...
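The objection can be made concrete with a small check. This is a minimal sketch that uses only the four phrases and labels listed in the comment above; the helper names (relations_by_pos_pair, ambiguous_pairs) are hypothetical, and whether "stunning" should be tagged VERB at all is disputed further down the thread.

```python
from collections import defaultdict

# (phrase, head UPOS, dependent UPOS, deprel), as listed in the comment above.
EXAMPLES = [
    ("a stunning picture", "NOUN", "VERB", "amod"),
    ("a dish to die for", "NOUN", "VERB", "acl"),
    ("his arrival while I drank coffee", "NOUN", "VERB", "advcl"),
    ("the problem that this caused", "NOUN", "VERB", "acl:relcl"),
]


def relations_by_pos_pair(examples):
    """Group the observed deprels by (head UPOS, dependent UPOS)."""
    table = defaultdict(set)
    for _phrase, head_upos, dep_upos, deprel in examples:
        table[(head_upos, dep_upos)].add(deprel)
    return table


def ambiguous_pairs(examples):
    """Return the POS pairs that map to more than one deprel."""
    return {
        pair: sorted(rels)
        for pair, rels in relations_by_pos_pair(examples).items()
        if len(rels) > 1
    }


print(ambiguous_pairs(EXAMPLES))
```

If the (NOUN, VERB) pair maps to more than one label, a merged mod relation cannot be expanded back to the UD labels from POS tags alone, at least under this tagging.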
I don't think that supplements are modifiers. For instance, they cannot be moved in front of the sentence
This is a difference between English and French, right? In English, non-RC supplements can be at the beginning of the sentence ("A die-hard conservative, her father refused." in #751).
But yes, I could see an argument that supplements are at a looser level of attachment than ordinary modifiers.
@amir-zeldes First, annotating stunning as a VERB and an amod is not in accordance with the guidelines: "An adjectival modifier of a noun (or pronoun) is any adjectival phrase that serves to modify the noun (or pronoun)."
Second, I don't see for what purpose or application it would be relevant to distinguish stunning as an amod from stunning as an acl, whereas it does seem useful to distinguish modifiers and supplements.
verbs can be amod (a stunning picture), acl (a dish to die for), advcl (his arrival while I drank coffee), or acl:relcl (the problem that this caused). In all of these, the parent is tagged NOUN and the child VERB...
In EWT, the annotation of gerunds is not consistent. Some are VERB, others ADJ. I reported it elsewhere.
Second, I don't see for what purpose or application it would be relevant to distinguish stunning as an amod or stunning as an acl. While it seems useful to distinguish modifiers and supplements.
Well, adjective modifiers in English are usually prenominal, whereas clausal modifiers are postnominal. So there is a phrase order justification to distinguish between amod and clausal modifiers in English (whereas acl and advcl lack such a distinction).
If amod and acl were collapsed you could lose a bit of information with postnominal adjectives like "else": how to tell whether it is an adjective modifier, or the predicate of a modifier clause? But perhaps the POS tag would suffice for most cases, if the corpora were consistent about treating prenominal adjectival modifiers as ADJ.
I don't think that POS of the head should be an overriding consideration, so I guess I would be in favor of amending the definition. For me, a clause that fulfills the function of an adverb (substitutable/pronominalizable/interrogable by an adverb) is an advcl.
Would expanding advcl to include things like "arrival while I was drinking" create a slippery slope for other deprels, e.g. "arrival on Friday" as obl instead of nmod? I don't want annotators to have to think too hard for such cases; if we must distinguish acl and advcl, "Is it attaching to a noun or a predicate?" is the most straightforward criterion I can think of.
Mm.. I see what you're saying, and with obl/nmod it seems more reasonable to me, since those are indeed primarily distinguished by the parent POS (SD didn't even distinguish them at all, calling both prep). So I think we can avoid a slippery slope by restricting those as we've been doing. For advcl I feel we are failing linguistically if we don't capture the parallelism to advmod. Just like we have nominal obj vs. clausal ccomp and nsubj<>csubj, I feel that advcl is the clausal equivalent of advmod. If a noun can have an advmod, then I think its clausal equivalent should be advcl, so "my arrival just then" should parallel "my arrival just when ...".
But looking up, I see we've veered off topic here :) if there's more need for discussion feel free to open an issue - otherwise in my mind there's no need to change anything about these cases in corpora like EWT or GUM, so no issue is needed from my end.
Hi, Extending the #523 discussion, I would like to know how we should annotate nominal phrases that are dependents of verbs, such as:
Here, the behavior in focus is that of "browsing", so we would interpret the NP as the apposition of a clause, but appos is a relation that should only depend on an NP, not on verbs.
Thanks in advance