Closed nschneid closed 12 months ago
I can make these cases consistent, but it looks like we both agree the correct deprel is advmod, which leads to the question of why tag them as ADJ and not ADV? If something is "third biggest", then that describes "in what way is it big" or "how big is it?", which for me means it's an adverb (interrogable by "how", so manner or extent in this case).
Can we move to amend this guideline? I'm happy to make them all ADV + advmod.
It's a case of productive extension—any ordinal number can be used in this construction; does that make it zero-derivation of ADV? I don't necessarily have a strong opinion but before changing the guideline we would need to hear why it was written that way.
Oh also this construction doesn't just modify adjectives:
So this would be amod(apples, third)?
Ever since the UD validator has put such an emphasis on equating advmod with ADV, it's been my understanding that adverbially used (morphological) adjectives should also be tagged ADV. This seems especially straightforward for English, since many morphologically unmarked items are regularly ADVs ("do something quick/ADV"), so I don't see the motivation for ADJ here in particular (it's not like we don't assume zero derivation for things like doing something quick, nice, fast etc.)
For "third most apples" I would have done advmod(most, third). I think the amod reading off the noun would mean something like there being three "most apples" instances, of which Sam is the third. So something like "Sam has (won the) third (iteration of the) most apples (award)". If it limits the scope of it being "most" (not absolutely most but third most), then it should be a child of "most".
Either way I'm curious what @dan-zeman and others think about this.
The validator will not complain if it encounters an ADJ attached as advmod. The validator mainly wants to avoid NOUN+advmod (because nouns should be obl instead), and VERB+advmod (because those should be advcl instead).
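To make the rule concrete, here is a minimal sketch of the check as described above. It is not taken from the validator's source: the function name and the lookup table are hypothetical and cover only the two cases mentioned here.

```python
from typing import Optional

# Hypothetical illustration only; NOT the UD validator's actual code.
# It encodes just the two UPOS+advmod combinations described above,
# mapped to the relation that is expected instead.
SUGGESTED_INSTEAD = {
    ("NOUN", "advmod"): "obl",
    ("VERB", "advmod"): "advcl",
}


def check_advmod(upos: str, deprel: str) -> Optional[str]:
    """Return the relation expected instead, or None if the pair is not flagged."""
    base = deprel.split(":")[0]  # ignore subtypes such as advmod:emph
    return SUGGESTED_INSTEAD.get((upos, base))


for upos in ("ADJ", "ADV", "NOUN", "VERB"):
    print(upos, "advmod ->", check_advmod(upos, "advmod"))
```

Under these assumptions, ADJ+advmod passes through without a suggestion, which is the point made above.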
If I understand correctly what the construction is supposed to mean, then I think that third should be attached to most and not to apples. Then advmod is probably more expected than amod, although I don't feel strongly about it. But I wouldn't change the tag of third from ADJ to ADV just because it occurs in such a construction.
I wouldn't change the tag of third from ADJ to ADV just because it occurs in such a construction.
We're agreed on the attachment, but this part surprised me - if functioning as an adverb (advmod) is separate from being morphologically an adverb (ADV), then why not accept NOUN+advmod too? The reason we don't attach these as just obl is that they are unmediated (look like objects in "I ran three hours"), so as a compromise we have subtypes like :npmod, :tmod etc., inherited from Stanford Dependencies. But if being adverbial is just a function, we could have tagged them as advmod with non-ADV pos as well, so this seems inconsistent.
Would you also tag the following as adjectives?
I wouldn't change the tag of third from ADJ to ADV just because it occurs in such a construction.
We're agreed on the attachment, but this part surprised me - if functioning as an adverb (advmod) is separate from being morphologically an adverb (ADV), then why not accept NOUN+advmod too? ... But if being adverbial is just a function, ...
Because nominals and modifier words are different categories in the top-level UD taxonomy. Adjectives and adverbs are both modifier words, so I see at least some room for debate. But nouns are nominals, hence no advmod is allowed for them. Think of obl as the label for "being adverbial" that is used with nominals.
Would you also tag the following as adjectives?
Maybe... or maybe not. It depends on how you want to define adverbs in English. That has been a mystery to me ever since I learned that the -ly suffix is not obligatory.
I see no need to reinvent the wheel on English ADJ vs. ADV. If a word like "cheap" or "long" could be replaced by "carefully" but not "careful", it should be ADV.
Regarding ordinal numbers, PTB says always ADJ, so it seems easiest to stick with that:
This construction is special, which is why they needed to mention it (and we should document it), but I think advmod(largest/ADJ, fourth/ADJ) is an acceptable option.
it should be ADV.
+1 !
Regarding ordinal numbers, PTB says always ADJ, so it seems easiest to stick with that
The first part of that image is curious and not in line with the data (see below), but I think you're misreading the second guideline: it says "compounds of the form fourth-largest", but you need to keep in mind that these were not tokenized apart in the original PTB, so they are just saying the whole thing (headed by "largest") is an adjective. If you look at OntoNotes, which contains the re-tokenized PTB and which I take to be the successor of PTB, you will see that the majority of cases tag the modifier as RB (admittedly it's 26:17, so not a huge majority), including in WSJ:
And similarly in the newer genres added by ON:
I think the "substitution by -ly" test suggests that things like sentence-initial ordinals ("First, ..." = "Firstly", "Second" = "Secondly") should be tagged as ADV as well, and again ON backs this up:
A query for "First ," and "Second ," shows the skew here is much stronger, with 91:13 in favor of RB (plus 5 cases of LS, oddly, even though it's spelled out as a word!). I don't think ordinals should be given a unique analysis when they fit the same normal ADV distribution tests as regular adverbs, and though I might have agreed if there was a huge precedent for doing this for consistency reasons, it seems ON doesn't do it either.
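For a sense of scale, the counts quoted above work out as follows. This is plain arithmetic on the numbers in this comment; the helper name is only for illustration, and "other" stands for whatever non-RB tag the remaining cases received.

```python
def share(favoured: int, other: int) -> float:
    """Proportion of the favoured tag among the two counts given."""
    return favoured / (favoured + other)


# Ordinal modifying a superlative: 26 RB vs 17 other.
modifier_rb = share(26, 17)

# Sentence-initial "First ," / "Second ,": 91 RB vs 13 other.
initial_rb = share(91, 13)

# Same query, also counting the 5 LS cases in the denominator.
initial_rb_incl_ls = 91 / (91 + 13 + 5)

print(f"modifier RB share:             {modifier_rb:.1%}")
print(f"sentence-initial RB share:     {initial_rb:.1%}")
print(f"sentence-initial, LS included: {initial_rb_incl_ls:.1%}")
```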
Re: ADJ vs. ADV generally, I was pointed to this paper which points out, for example, that adverbs can be postmodifiers of nouns ("his announcement recently that he would resign"). That and other constructions (adjectival compounds, etc.) are used to argue that the distinction cannot be made purely based on what is being modified.
In UD terms, "his announcement recently" is especially awkward because the adverbial vs. adnominal distinction is baked into the deprel, not just the POS.
http://match.grew.fr/?corpus=UD_English-GUM@dev&custom=6221941141b83&clustering=X.upos reveals inconsistent treatment of both UPOS and deprels.
The ADJ guidelines specify that the ordinal in these cases should be tagged as ADJ despite modifying another adjective (presumably as advmod).
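For anyone who wants to reproduce this kind of tally offline, here is a rough sketch that counts (UPOS, deprel) pairs for ordinals attached to a superlative head in CoNLL-U text. It is not the Grew-match query linked above, which is not reproduced here. It assumes the treebank marks ordinals with NumType=Ord and superlatives with Degree=Sup, and the two-sentence sample is invented rather than taken from GUM.

```python
from collections import Counter


def tally_ordinal_modifiers(conllu: str) -> Counter:
    """Count (UPOS, deprel) pairs for ordinals whose head is a superlative."""
    counts = Counter()
    for block in conllu.strip().split("\n\n"):
        tokens = {}
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue  # skip blank lines and sentence-level comments
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue  # skip multiword-token ranges and empty nodes
            tokens[cols[0]] = cols
        for cols in tokens.values():
            head = tokens.get(cols[6])  # None for the root (head "0")
            if head is None:
                continue
            is_ordinal = "NumType=Ord" in cols[5].split("|")
            head_is_superlative = "Degree=Sup" in head[5].split("|")
            if is_ordinal and head_is_superlative:
                counts[(cols[3], cols[7])] += 1
    return counts


def row(*cols: str) -> str:
    """Join the ten CoNLL-U columns with explicit tabs."""
    return "\t".join(cols)


# Invented sample showing the two competing taggings (hypothetical data).
SAMPLE = "\n".join([
    "# text = the third biggest city",
    row("1", "the", "the", "DET", "DT", "Definite=Def|PronType=Art", "4", "det", "_", "_"),
    row("2", "third", "third", "ADJ", "JJ", "Degree=Pos|NumType=Ord", "3", "advmod", "_", "_"),
    row("3", "biggest", "big", "ADJ", "JJS", "Degree=Sup", "4", "amod", "_", "_"),
    row("4", "city", "city", "NOUN", "NN", "Number=Sing", "0", "root", "_", "_"),
    "",
    "# text = the second largest lake",
    row("1", "the", "the", "DET", "DT", "Definite=Def|PronType=Art", "4", "det", "_", "_"),
    row("2", "second", "second", "ADV", "RB", "NumType=Ord", "3", "advmod", "_", "_"),
    row("3", "largest", "large", "ADJ", "JJS", "Degree=Sup", "4", "amod", "_", "_"),
    row("4", "lake", "lake", "NOUN", "NN", "Number=Sing", "0", "root", "_", "_"),
])

for pair, n in sorted(tally_ordinal_modifiers(SAMPLE).items()):
    print(pair, n)
```

Run over a real treebank file, more than one (UPOS, deprel) pair in the output would point to the kind of inconsistency described above.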