UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

fixed expressions that function as case/mark #400

Open nschneid opened 1 year ago

nschneid commented 1 year ago

One of our goals has been to revisit the English fixed list and make it more systematic. From Dan Flickinger I obtained a list of the words-with-spaces in the English Resource Grammar lexicon that would function like prepositions (case) or subordinators (mark). Here is a comparison of that list and the current fixed guidelines:
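A comparison of this kind amounts to a few set operations once both lists are normalized. The sketch below is only an illustration: the function names, the bucket keys, and the `placeholder ...` entries are made up for the example, and the real ERG and guideline lists are not reproduced here. Only "out of", "according to", and "as opposed to" are taken from this issue.

```python
def normalize(expr):
    """Lowercase and collapse whitespace so 'Out  of' and 'out of' compare equal."""
    return " ".join(expr.lower().split())


def compare_lists(erg, fixed, non_fixed):
    """Bucket expressions the way the headings in this issue do.

    erg       -- words-with-spaces from the ERG lexicon
    fixed     -- expressions the guidelines list as fixed
    non_fixed -- expressions the guidelines explicitly exclude from fixed
    """
    erg = {normalize(e) for e in erg}
    fixed = {normalize(e) for e in fixed}
    non_fixed = {normalize(e) for e in non_fixed}
    return {
        "in_both": sorted(erg & fixed),
        "erg_only": sorted(erg - fixed),
        "fixed_only": sorted(fixed - erg),
        "non_fixed_not_erg": sorted(non_fixed - erg),
    }


# Hypothetical inputs: the "placeholder" entries stand in for real list items.
erg_list = ["out of", "According to", "as opposed  to", "placeholder one"]
fixed_list = ["placeholder one", "placeholder two"]
non_fixed_list = ["out of", "placeholder three"]

buckets = compare_lists(erg_list, fixed_list, non_fixed_list)
for name, items in buckets.items():
    print(name, items)
```

Normalizing before comparing matters because the two sources may differ in capitalization or spacing; beyond that, the bucketing is plain set intersection and difference.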

In both lists

In ERG list, currently not listed as fixed in guidelines

Note that some of these fall under the Double Spatial Prepositions exception: the guidelines currently state that such expressions (out of, etc.) are NOT fixed. (We have considered changing this, notably because it incorporates a semantic criterion, which may not be ideal for UD; see UniversalDependencies/docs#795.)

Apart from ADP+ADP combinations, we see some recurring grammaticalization patterns like ADV+ADP and ADP+NOUN+ADP. We may or may not want to treat these as fixed, but we should try to be consistent by group.
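To check consistency by group, one option is to pull every `fixed` expression out of a CoNLL-U file and bucket it by UPOS pattern. The following is a minimal sketch, assuming standard 10-column CoNLL-U input; the helper names (`read_sentences`, `fixed_expressions`, `group_by_pattern`) are hypothetical, and the sample sentence is hand-written for illustration rather than copied from the treebank.

```python
from collections import defaultdict


def read_sentences(conllu_text):
    """Yield one dict per sentence: token id -> (form, upos, head, deprel)."""
    sentence = {}
    for line in conllu_text.splitlines():
        if not line.strip():          # blank line ends a sentence
            if sentence:
                yield sentence
                sentence = {}
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():     # skip multiword-token ranges and empty nodes
            continue
        sentence[int(cols[0])] = (cols[1].lower(), cols[3], int(cols[6]), cols[7])
    if sentence:
        yield sentence


def fixed_expressions(conllu_text):
    """Yield (expression, UPOS pattern) for each head with `fixed` dependents."""
    for tokens in read_sentences(conllu_text):
        deps_of = defaultdict(list)
        for tid, (_, _, head, deprel) in tokens.items():
            if deprel.split(":")[0] == "fixed":
                deps_of[head].append(tid)
        for head, deps in sorted(deps_of.items()):
            ids = [head] + sorted(deps)
            yield (" ".join(tokens[i][0] for i in ids),
                   "+".join(tokens[i][1] for i in ids))


def group_by_pattern(conllu_text):
    """Map each UPOS pattern (e.g. 'ADP+ADP') to the sorted expressions showing it."""
    groups = defaultdict(set)
    for expr, pattern in fixed_expressions(conllu_text):
        groups[pattern].add(expr)
    return {pattern: sorted(exprs) for pattern, exprs in groups.items()}


# Hand-written sample; the annotation here is illustrative, not authoritative.
_rows = [
    ("1", "I", "I", "PRON", 2, "nsubj"),
    ("2", "stayed", "stay", "VERB", 0, "root"),
    ("3", "home", "home", "ADV", 2, "advmod"),
    ("4", "because", "because", "ADP", 7, "case"),
    ("5", "of", "of", "ADP", 4, "fixed"),
    ("6", "the", "the", "DET", 7, "det"),
    ("7", "rain", "rain", "NOUN", 2, "obl"),
    ("8", ".", ".", "PUNCT", 2, "punct"),
]
SAMPLE = "# text = I stayed home because of the rain.\n" + "\n".join(
    "\t".join([i, form, lemma, upos, "_", "_", str(head), deprel, "_", "_"])
    for i, form, lemma, upos, head, deprel in _rows
) + "\n"

print(group_by_pattern(SAMPLE))
```

Run over the real `.conllu` files, a grouping like this would show at a glance whether, say, all ADV+ADP combinations are treated the same way or only some of them.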

In the words-with-spaces list there are not many deverbal combinations like "based on" or "compared to/with". Above we see "according to" and "as opposed to"; below we see "give or take", "given that", "going by", "provided that", and "thanks to".

In fixed guidelines, not in ERG list

Explicitly non-fixed, also not in ERG lexicon

Other kinds of fixed expressions

Of the non-prepositional fixed expressions, ERG has lexical entries for:

The fixed page explicitly excludes the following, though ERG has lexical entries: