Closed nschneid closed 3 years ago
@amir-zeldes any opinions?
Well, I agree that from a historical perspective the current analysis is a bit simplistic, but I think it is mainly motivated by a "no better solution" situation: in many ways, it's like "a lot of" but without the "of", so you can't do a normal nmod with "few" as the head. For "a few of the apples" I think the analysis should parallel "a lot of".
The fact that "a few" can stand by itself is true, but that doesn't mean it has to have the same analysis. Semantically it is pretty similar to "many", as you say, or also "some" as a quantity expression.
I think I could see doing "a <-det few" instead of attaching to the noun, but I would be opposed to fixed since there is definitely the possibility of modification, here's a random example:
And in general, I would say it's relatively clear "few" outranks "a" as a possible head, so it's not really 'structureless'. The only real question is 'who does the "a" belong to'.
The main reasons I would hesitate to change this are:
amod, it attaches like an adjective, and then if there's an article that attaches to the noun
So I guess it's mainly a question of whether that would make the analysis much better, and I'm not super sure about that. I guess it would be nmod:npmod to the noun, right? Or what were you thinking for the function of "a few" with respect to the plural noun?
On the view that it's been grammaticalized into a compound determiner, it could be det with respect to the noun.
But mainly I just want it to be a constituent. Even with rare internal modification (not sure if "a scant few" is a frozen expression or not), consider:
But article+adjective with an elided noun is not possible in general:
So I think this is a good argument for making a+few into a constituent. As is, "a" is attached to "few" only when there is no head noun, so the benefit of changing it would be consistency.
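To make the constituency point concrete, here is a minimal Python sketch. It is not tied to any UD tool: the `subtree` helper and the bare (head, dependent) edge lists are made up for illustration, and deprel labels are left out. It compares what "few" dominates in "I bought a few books" when "a" attaches to the noun versus to "few":

```python
from collections import defaultdict


def subtree(edges, root):
    """Return the set of tokens dominated by `root`, including `root` itself."""
    children = defaultdict(list)
    for head, dep in edges:
        children[head].append(dep)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children[node])
    return seen


# "I bought a few books" as (head, dependent) pairs; labels omitted.
# Status quo: "a" and "few" both attach to "books".
status_quo = [("bought", "I"), ("bought", "books"), ("books", "a"), ("books", "few")]
# Alternative under discussion: "a" attaches to "few".
proposed = [("bought", "I"), ("bought", "books"), ("books", "few"), ("few", "a")]

print(sorted(subtree(status_quo, "few")))  # ['few']
print(sorted(subtree(proposed, "few")))    # ['a', 'few']
```

Only in the second tree do "a" and "few" come out as a unit of their own, which is the sense of "constituent" used here.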
It occurs to me that this is similar to article+number expressions like "a hundred", which also permit plural noun ellipsis: I bought a hundred (books).
GUM treats "a hundred" as a constituent (det(hundred, a), with "hundred" attaching as nummod). So treating "a few" and "a little" as constituents would be similar to that.
Yes, "a hundred" is pretty natural to treat that way since we need to deal with "one hundred" as well, which has a similar structure. It also doesn't create any friction in terms of labeling 100, since that's still nummod. But that brings me back to my earlier questions: what do you want the deprel of "few" to be?
My main hesitation in touching this (aside from it being work) is that I find nmod:npmod on "few" less elegant than the current amod, and det on amod feels kind of wrong. With the current analysis it's not exactly perfect, but I get what it's saying and the labels are rather tame looking.
I agree det modifying an adjective feels kind of wrong, but that's what we are currently forced to do if the head noun is omitted. The fixed analysis makes the most sense to me, i.e. capturing that "a few" and "a little" are two syntactically very special multiword expressions combining what was historically a determiner and an adjective. ("A lot" and friends are semantically special but not as much syntactically because "lot" is historically a noun.)
det on an adjective with no noun is the normal way of handling 'promotion' (or more traditionally, nominalization of the adjective), so I don't find it bad. It's the same as "the poor", or "the following". My concern was having something with deprel amod having the det, which this would introduce in this construction.
I think I feel pretty strongly against fixed for a number of reasons:
As I said above, I think if anything nmod:npmod makes the most sense if we want "a few" as a constituent, since it would be analyzed as a full modifier NP indicating extent without a preposition. But I don't think that's really right, and honestly this just makes me want to stick to amod and leave "a" as det to the noun in one of those "UD just likes fountainy graphs" moments (there are quite a few of these murky fountains, like "not only" (not doesn't modify only in EWT), or "to" attaching to a predicate noun/adj in things like "to be good"). It's not 100% right, but it makes life a bit simpler and I've just come to accept it in the name of lexicocentrism I guess.
I think I feel pretty strongly against fixed for a number of reasons:
- Whatever we think "scant" is, the construction is modifiable, so you'd be introducing potentially quite a few "fixed" expressions to an already long list
Here's CGEL (p. 392):
Really "a few" and "a little" are special constructions that permit very limited internal modification. Maybe UD needs an almost-fixed relation. ;)
- If fixed is really only used for things where headedness is meaningless, I don't think this is such a case: I'm pretty sure "few" outranks "a"
I don't have that intuition. I think they act in concert.
- It obscures the fact that "a" is actually fulfilling a fairly normal role here, and erases the parallelism with "few" without an article. Currently we have "few dogs" and "a few dogs" being structurally very similar (both have an adjectival modification indicating they are few).
But if you think the indefinite article is acting as usual then it's very odd that it's modifying a plural noun, yes? You can't say "a several dogs" or "a many dogs". "A few" is simply a syntactically special expression.
I agree it's a special construction, but there are lots of special constructions and we only have very few labels... The same kind of a+plural appears in a number of places, for example "a great many" etc., so it's not quite unique.
In any case, it sounds like we agree that it's not quite or only almost fixed, so I think we should take fixed off the table. If that makes sense, then my question remains, what would the deprel be? I think if we change it, few should be the head and "a" should be a dependent as it usually is.
flat?
The flat relation is one of three relations for multiword expressions (MWEs) in UD (the other two being fixed and compound). It is used for exocentric (headless) semi-fixed MWEs like names (Hillary Rodham Clinton) and dates (24 December). It contrasts with fixed, which applies to completely fixed grammaticized (function word-like) MWEs (like in spite of), and with compound, which applies to endocentric (headed) MWEs (like apple pie).
I would say the definition fits because we are talking about a headless semi-fixed MWE. It is more grammaticized than names but there's nothing saying that grammaticized expressions cannot be flat, only that the completely frozen ones are fixed.
"a good/great many": these are also MWEs acting as quantifiers with special morphosyntax. a many books, a very/considerable many books, *a great several books. CGEL p. 394:
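I'd be against flat for the same reasons as fixed. It's not really similar to any of the current uses of flat in English, and I really don't see a reason not to treat "a" as a determiner here. "a" generally doesn't have dependents and typically modifies both nouns and nominalized adjectives, so it fits perfectly as det. If I wanted "a few" to be a constituent (i.e. dependency chain), I think it would be most parallel to the extent modifiers found with spatio-temporals and other quantities. Compare a possible: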
. If I wanted "a few" to be a constituent (i.e. dependency chain), I think it would be most parallel to the extent modifiers found with spatio-temporals and other quantities. Compare a possible:
To existing analyses like:
I'm still not sure it's a huge improvement over what we have now, but the idea of "a" heading a flat expression seems wrong when it's basically just a determiner and looks exactly the same as in the independent "a few" functioning as an argument (I assume you wouldn't make that be flat either way, right?)
So I guess the crux of the question is whether it is a nominalized adjective like "the poor" or "the British". I don't see it that way—yes you could say "the many as opposed to the few" just like "the rich as opposed to the poor", but this strikes me as an entirely different usage from "a few" as in "I bought a few". And note that while you can say "the many as opposed to the few" you cannot say "a many books", so there is something special about "a few" and "a great many".
Can the normal nominalized adjective construction be used with an indefinite article? Here are results with indefinite article + ADJ other than "few" or "little", and they look like annotation errors (attachment errors or lexemes with distinct nominal senses that I'd tag as NOUN, e.g. "a contemporary" = 'a person who is around at the same time as someone'), not a productive construction where an adjective is coerced to a nominal.
"I bought a few": I'd want that to look similar to "I bought many". This could be achieved with flat + an ExtPos feature. Remember that flat is really asserting that there is no head syntactically ("a" and "few" are on equal footing); there's only a head in the tree for data structure convenience.
I'd want that to look similar to "I bought many"
Yes, but I'd like "I bought a few" to look similar to "I bought few"...
Yeah that would be achieved too with flat and ExtPos=ADJ.
I'm not sure I understood you- are you saying "a few" by itself should also be flat? If so, what about "the few" in "the few I know"? And what would you do about things like:
I think keeping "a" as det to something makes the most sense given the flexibility and potential internal modifications of few.
Oooh fun. "A long few days": hadn't thought about this. I think it's yet another construction ("a" + adjective quality modifier(s) + quantifier): "a tough several days", "an unprecedented very long 8 days". So I would say that in that case "few" is acting as a normal quantifier-adjective, and this construction licenses a special use of "a" but I'm not sure whether to say the quantity adjective itself licenses "a". I.e. it seems like a semiproductive construction where "a" is special but not part of a frozen expression, so UD won't have the capacity to capture it and det(days, a) is probably the best we can do.
Note that "a long few days" and "a few long days" parse differently for me, even if they work out to meaning similar things in practice. (cf. a long many days vs. *a many long days)
Another few weeks, the next few weeks, etc.—I think here "few" is just a regular quantifier adjective. (cf. another several weeks, the next several weeks)
I'm not sure I understood you- are you saying "a few" by itself should also be flat? If so, what about "the few" in "the few I know"?
"The few I know" fits the definite-article + adjective-coercion-to-nominal construction, so I think that would be det just like "the rich".
Yes, flat + ExtPos=ADJ for "a few" whether it is by itself or not. The test is: can it be grammatically replaced by "many"? If so, analyze it as a syntactic equivalent that just happens to have two words without internal structure.
I think "a long few days" and "a few days" is the same "few". I would consider all of the above constructions to contain the article with its normal label of det, and I think it would be the most intuitive and reliable for annotators as well.
If "few" works as an adjective in "my very few books", it appears that "a few" works as a determiner in "a few books" because it blocks the position and prevents any other determiner from appearing. So I think that "a few" must be det with ExtPos=DET.
About the internal structure of "a few": you showed that some adjectives are possible, so it is possible not to treat "a few" as a fixed expression and to analyse "a" as a det of "few". It also seems that "few" in "a few" acts as a NOUN in this case. Maybe I am influenced by French, where we have the ADV/NOUN "peu", which now acts as an adverb ("il lit très peu" 'he reads very little') but was a noun and appears in many fixed expressions with a determiner ("il lit un peu" 'he reads a little'; "le peu que j'en sais" 'the little I know', etc.).
I agree with @nschneid that "a long few days" might be a different construction, where "few days" is a unit and not "a long few".
Oh good point. So "I bought a few books" has ExtPos=DET. What about "I bought a few"—ExtPos=ADJ or DET? Note that "many" is always tagged as ADJ, whereas "some" is always tagged as DET. Both can appear without a head noun: I bought many/some. I guess treating "a few" as DET across the board, like "some", would make sense (the many books / the a few books / the some books).
Re: "few" acting as a noun...I agree it can do this in some contexts (e.g. with a definite article), but in "a few" it is hard for me to say it is more noun-like than "some" or "many" apart from "a" looking superficially like an article. And a DET MWE with an internal DET feels a bit awkward to me. Seems like the more neutral solution is to say, this is a weird expression that doesn't generalize to words other than "a" + "few/little" + occasionally an internal modifier, and flat is a way to achieve that.
And from a native speaker intuition perspective, I am having trouble conceptualizing "few" and "little" as nouns (as opposed to "lot" or "bit"), except when coerced with a definite article.
Let me recap my arguments against flat/fixed:
I agree that there are interesting and subtle differences between the various constructions, but I think an average user would expect these to look similar:
- I know a few
- a few that I know
- the few that I know
- few that I know (would say that)
- our last few remaining problems
Across these related constructions, "few" can be combined with most normal determiner options, suggesting that the "super-schema" of what they have in common basically calls for one of the standard English determiners (incl. zero). I think that making "few" be a child of the determiner in only some of these is asking for annotator disagreements and parsing errors, and I don't see any real benefit.
If the goal is to have "a few" as a phrase, then I think it should follow the normal determiner as child + det option, which would maintain the status quo for independent "a few" and keep parallelism to the other constructions with "few".
Let me lay out my argument for the different constructions before getting to whether it is practical or not for annotators.
I think the key syntactic tests for MWE status are:
I would advocate the MWE analysis (with flat) only for cases that pass the first test and fail the second, showing that the expression is particular to "a" + "few"/"little" and that together they function like "some". For example:
1. obj(bought, books), flat(a/DET, few/ADJ), a+few: ExtPos=DET, det(books, a)
2. obj(bought, few), flat(a/DET, few/ADJ), a+few: ExtPos=DET
3. obj(bought, books), amod(books, few/ADJ)
4. obj(bought, books), amod(books, few/ADJ), det(books, the)
5. obj(stayed, days), det(days, a), amod(days, long), amod(days, few)
So while on the surface these are similar:
- a few that I know
- the few that I know
the above tests distinguish them (cases 2 & 4).
Now, we have the question of whether it is practical to have UD annotators remember to use an anomalous deprel of "a" in the first two cases (flat rather than det). I can see the point that it requires a lot of nuance. While I think flat is more of a "proper" MWE treatment, I think it is OK to fudge a bit on the internal structure deprels to make them look more like the typical uses of those deprels, so I could envision the following compromise that does away with flat:
1. obj(bought, books), det(few/ADJ, a/DET), few: ExtPos=DET, det(books, few) [whole expression considered a determiner to address @sylvainkahane's point]
2. obj(bought, few), det(few/ADJ, a/DET), few: ExtPos=DET
3. obj(bought, books), amod(books, few/ADJ)
4. obj(bought, books), amod(books, few/ADJ), det(books, the)
5. obj(stayed, days), det(days, a), amod(days, long), amod(days, few)
Basically the principle would be, "a" modifies "few" rather than the head noun apart from case 5, with an intervening adjective that doesn't modify "few". We already do this if there is no head noun, so the change would make things more consistent.
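As a rough illustration of how that principle could be checked mechanically, here is a small Python sketch. The function name `expected_article_head` and the token dictionaries are hypothetical, not part of any UD validator, and the token lists contain only the words named in the analyses above. It assumes the annotator has already decided what any intervening adjective modifies:

```python
def expected_article_head(tokens):
    """Return the id that "a" should attach to under the compromise, or None.

    tokens: list of dicts with 1-based integer "id", "form" and "head".
    """
    by_form = {t["form"].lower(): t for t in tokens}
    a, few = by_form.get("a"), by_form.get("few")
    if a is None or few is None or a["id"] > few["id"]:
        return None
    between = [t for t in tokens if a["id"] < t["id"] < few["id"]]
    # Case 5: an intervening word that does not modify "few",
    # so "a" stays on whatever "few" itself attaches to.
    if any(t["head"] != few["id"] for t in between):
        return few["head"]
    # Otherwise "a" modifies "few", with or without a head noun.
    return few["id"]


# Case 1 (tokens: bought, a, few, books); "few" attaches to "books".
case1 = [
    {"id": 1, "form": "bought", "head": 0},
    {"id": 2, "form": "a", "head": 3},
    {"id": 3, "form": "few", "head": 4},
    {"id": 4, "form": "books", "head": 1},
]
# Case 5 (tokens: stayed, a, long, few, days); "long" attaches to "days".
case5 = [
    {"id": 1, "form": "stayed", "head": 0},
    {"id": 2, "form": "a", "head": 5},
    {"id": 3, "form": "long", "head": 5},
    {"id": 4, "form": "few", "head": 5},
    {"id": 5, "form": "days", "head": 1},
]

print(expected_article_head(case1))  # 3, i.e. "few"
print(expected_article_head(case5))  # 5, i.e. "days"
```

The hard part, deciding whether an intervening adjective modifies "few" or the noun, is still left to the annotator; the sketch only covers the bookkeeping that follows from that decision.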
I understand the reluctance of @amir-zeldes to use fixed for "a few" because the internal structure is quite clear and the expression is not completely frozen, seeing that some ADJ can modify "few". But according to UD choices, I think that "a few" must be considered as a fixed expression. The main argument is the fact that "few" and "a few" don't have exactly the same distribution: "few" works as an amod, while "a few" works as a det.
(Note that in SUD we decided not to use the relation fixed, and to indicate the internal syntactic structure of idioms and use features PhraseType=Idiom and ExtPos on the head of the idiom and InIdiom=Yes for the words inside the idiom. It means that we will use the relation det(few, a) and add PhraseType=Idiom and ExtPos=DET on "few".)
@nschneid I don't think flat is a possible solution. If we consider "a few" as a fixed expression, the relation must be fixed; if not, the relation must be det(few, a).
I agree with @sylvainkahane that there is internal structure, but I think that rules out not only flat, but also fixed, since UD stipulates that fixed goes left to right from the first token. If it is not completely frozen, as you say above, then it is also not fixed. The fact that "few" and "a few" are not distributed the same is true of many items with and without an article, but still, when the article is there I think it is the regular determiner, and the NP "a few" has a certain function then, which can be non-identical to the function of "few".
About @nschneid's suggestion: I would feel uncomfortable doing a chain of dets, which is otherwise unattested in English. I understand the appeal of treating it like a complex determiner, but I think here we run up against UD's token-centric conventions, which also mandate that "hours" in "dance two hours" is not advmod, since "hours" in itself is not an adverb. I think since we are dealing with a phrasal modifier of nouns, it should be nmod:npmod, and this also avoids the awkward det chain. It's also not unusual for an nmod subtype to take up a specifier position, just like possessive nmod:poss can mark a genitive possessive or a possessive article (my, your etc.)
Finally, specifically for 5 above, if we want a phrasal "a few" analysis, then I don't think "a" should modify "days" (though currently it would). I think although semantically it is the days that are long, syntactically we have:
det(few, a) amod(few, long) nmod:npmod(days,few)
I see your point but nmod:npmod indicates we are analyzing "a few" as a nominal expression, which seems wrong when it's in specifier position ("a few books"). To me it is bare "a few" ("I bought a few") that is coerced from a compound determiner into a nominal, not that it is always a nominal.
Finally, specifically for 5 above, if we want a phrasal "a few" analysis, then I don't think "a" should modify "days" (though currently it would). I think although semantically it is the days that are long, syntactically we have:
det(few, a) amod(few, long) nmod:npmod(days,few)
But this makes "a long few" into a constituent, which I don't see any evidence for.
I think it is a nominal expression, that's why it can take "a" in the first place, no? Compound determiners generally still have an underlying part of speech, and I think it's the same "a few" as always (so either a noun, or an adjective nominalized into a noun, but either way a nominal).
But this makes "a long few" into a constituent, which I don't see any evidence for.
Some examples with an adjective but without a noun:
I feel really put off by the look of det <- det, especially with it being unparalleled elsewhere in English dependency analyses, and probably being typologically very rare across UD. In fact, I'm starting to think that if it is so unclear then maybe the best solution is actually the current one, which is possibly a bit naive but at least easy to explain and consistent across constructions.
"a select/specialized/fair/rare few": these are all internal modification of "a few" (which is why it's arguably not fixed). For me the modifiers allowed here are limited to explaining how few or particularized the amount is, not arbitrary properties of the thing being quantified.
The "bad few" example is ungrammatical for me. Perhaps a different dialect.
the modifiers allowed here are limited to explaining how few or particularized the amount is, not arbitrary properties
I think those are semantic distinctions and not syntactic ones. Once the NN is dropped, the only thing left for modification is "few" either way. I'm also not really sure that's true of some of these, for example being a "select" few is a property of the thing counted, not of the count itself. But if you want clearer examples of semantically 'non-quantitative' modifiers, like "bad", those aren't too hard to find:
Interesting. This is a sense of "few" that at least some dictionaries categorize as a noun. I think it prototypically means a minority of people. And I notice the adjective is often evaluative; maybe this is part of the construction's core meaning (singling a small group of people out as aberrant).
Could it refer to a small quantity in general? I'm not sure:
?Though I enjoy eating berries, occasionally the experience is marred by a sour few.
??You can have all the sweet berries; I'll just have a sour few. (Better: a few sour ones)
??While the treatment used to require upwards of 12 visits, now it can be completed in only a short few.
Anyway I'm becoming convinced that there's a continuum of grammaticalization at work here, with several interrelated and nuanced constructions, which is why categorization is tricky.
Yes, the more I look into examples the more I realize how productive this construction is, and it's hard to find clear boundaries between classes. For me this all speaks for just leaving it alone, and definitely staying away from fixed solutions.
BTW there are plenty of examples of non-humans with adjectives, for example:
OK but all of these seem to me like quantity-modifying adjectives rather than property adjectives. "a good few (budgies)" doesn't mean a few budgies that are good. One could have a good few bad budgies.
I'm polling people on social media to see how they feel about the grammaticality of some of these. And getting mixed responses. Clearly it's complicated!
Hm, I see, well if you want 1. the few construction, 2. for the noun to be missing, 3. for there to be an adjective modifying few, 4. for that adjective to be semantically non-quantitative and 5. for the entire phrase to stand for a non-human... that's just going to be very rare by virtue of the chain rule - but I'm not sure if that's important for the syntax.
If it is, then wading through corpus examples I can offer these cases, some are more borderline than others:
But TBH your example "a good few bad budgies" suggests to me that adjectives belonging syntactically to the 'lexical' noun should appear between it and "few", so if "long" were a syntactic modifier of "days", we should get "a few long days", not "a long few days". Semantics aside, I think the difference in position probably corresponds to a syntactic subordination difference.
That all being said, I'm fine with keeping everything always being a dependent of the lexical noun if available, and of "few" if not, which is the status quo.
Yeah I'm hearing various levels of discomfort about some of these...a lot of "I definitely wouldn't say that but maybe someone would". So there's probably no fixed set of boundaries that all speakers would agree on, just some more prototypical and less prototypical cases.
Even if I had a clear linguistic analysis of the full range of constructions, it probably wouldn't be obvious how to map them to UD in practice. So let's just stick with the status quo for now.
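For anyone who wants to audit how consistently the status quo is applied, here is a minimal Python sketch that scans CoNLL-U text for adjacent "a few" and reports how the article is attached. The two sample sentences are hand-built to mirror the analyses described in this thread, not taken from EWT or GUM, and the helper names are made up; real treebank files would need to be read from disk.

```python
# Hand-built sample in CoNLL-U column order (ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC).
SENTENCES = [
    ["1 I I PRON _ _ 2 nsubj _ _",
     "2 bought buy VERB _ _ 0 root _ _",
     "3 a a DET _ _ 5 det _ _",
     "4 few few ADJ _ _ 5 amod _ _",
     "5 books book NOUN _ _ 2 obj _ _"],
    ["1 I I PRON _ _ 2 nsubj _ _",
     "2 bought buy VERB _ _ 0 root _ _",
     "3 a a DET _ _ 4 det _ _",
     "4 few few ADJ _ _ 2 obj _ _"],
]
# Real CoNLL-U is tab-separated with a blank line between sentences.
SAMPLE = "\n\n".join(
    "\n".join("\t".join(row.split()) for row in sent) for sent in SENTENCES
) + "\n"


def _scan(sent):
    """Yield (deprel of "a", form of its head) for each adjacent "a few"."""
    forms = {t["id"]: t["form"] for t in sent}
    out = []
    for a, few in zip(sent, sent[1:]):
        if a["form"].lower() == "a" and few["form"].lower() == "few":
            out.append((a["deprel"], forms.get(a["head"], "ROOT")))
    return out


def article_attachments(conllu):
    """Collect article attachments for every "a few" in a CoNLL-U string."""
    results, sent = [], []
    for line in conllu.splitlines() + [""]:  # trailing "" flushes the last sentence
        if line.startswith("#"):
            continue
        if not line.strip():
            results.extend(_scan(sent))
            sent = []
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():  # skip multiword-token ranges and empty nodes
            continue
        sent.append({"id": int(cols[0]), "form": cols[1],
                     "head": int(cols[6]), "deprel": cols[7]})
    return results


print(article_attachments(SAMPLE))  # [('det', 'books'), ('det', 'few')]
```

Under the status quo, "a" should come out attached to the lexical noun whenever there is one and to "few" otherwise; anything else in the output would be a candidate for manual review.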
I would like to discuss the analysis of "a few" (and later "a little").
Typically it modifies a plural noun, in which case the current analysis almost across the board in EWT and GUM is to attach "a" and "few" separately to the head noun, with det and amod respectively.
I'm not sure this is the best analysis, however: consider that "a few" can stand alone for the NP, similar to "some" or "several":
"A few" can also be coordinated with quantifiers/quantities:
It can be followed by an "of"-PP:
And it can modify things other than plural nouns:
The "a" cannot be omitted or replaced with "the" and receive the same interpretation, nor can a possessive + few receive that interpretation:
Taken together, the expression functions a lot like determinative "many".
"A little" behaves very similarly except for mass rather than plural nouns:
Would it be better to analyze these as fixed(a, few), fixed(a, little)?
Here are the exceptions to the typical analysis in the corpus (including a couple of errors):
Note that there are other quantificational expressions with "a" + NOUN: a couple, a bit, a lot, a bunch. These already form a constituent with the article attaching to the noun as det.
Internal modification: some of these expressions allow intensification, but these may be lexicalized expressions as well. The ADJ ones seem less flexible than some of the NOUN ones:
a + ADJ
a + NOUN