UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International

`fixed` differences between PUD and EWT #541

Open AngledLuffa opened 1 month ago

AngledLuffa commented 1 month ago

There are several fixed expressions marked in PUD which are not marked in EWT. Here are a few:

not marked in EWT:

after all
**After all**, the internet is not a luxury

as if
photographs that looked **as if** they were from the 1970s

at best
**At best** it is naive and at worst it would yet again...

close to ... similar to "approximately"
Cairo had a population of **close to** half a million

in addition ... similar to "furthermore"
**In addition**, statute determines the election of assembly of regions

Marked in PUD but does not occur in EWT; I just want to confirm we should leave it as fixed:

more or less:
The working time undertaken in this first hour is **more or less** equal to 45 minutes.

nschneid commented 1 month ago

https://universaldependencies.org/en/dep/fixed.html documents "as if" as fixed, and "at best" as non-fixed. "after all" is not discussed but I think it's just a PP idiom (cf. "after all that happened").

Not sure whether "more or less (equal)", "close to (half a million)" or "in addition" should be added to the list of fixed expressions.
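For anyone who wants to survey the candidates empirically, here is a minimal sketch that lists the word sequences joined by `fixed` in a CoNLL-U file and shows which ones appear in one treebank but not the other. It assumes the standard 10-column CoNLL-U layout and the usual convention that the first word heads the expression. The names `fixed_expressions` and `row` are hypothetical helpers, and the two toy sentences are invented for illustration, not copied from PUD or EWT.

```python
from collections import Counter, defaultdict


def fixed_expressions(conllu_text):
    """Count lowercased word sequences joined by the `fixed` relation."""
    counts = Counter()
    for block in conllu_text.strip().split("\n\n"):
        forms = {}                    # token id -> word form
        members = defaultdict(list)   # head id -> ids of its `fixed` dependents
        for line in block.splitlines():
            if not line or line.startswith("#"):
                continue              # skip sentence-level comments
            cols = line.split("\t")
            if len(cols) != 10 or not cols[0].isdigit():
                continue              # skip multiword-token ranges and empty nodes
            tok_id, form, head, deprel = int(cols[0]), cols[1], cols[6], cols[7]
            forms[tok_id] = form
            if deprel.split(":")[0] == "fixed" and head.isdigit():
                members[int(head)].append(tok_id)
        for head_id, deps in members.items():
            ids = sorted([head_id] + deps)
            counts[" ".join(forms[i].lower() for i in ids if i in forms)] += 1
    return counts


def row(i, form, head, deprel):
    # Hypothetical helper: fills only ID, FORM, HEAD, DEPREL; other columns are "_".
    return "\t".join([str(i), form, "_", "_", "_", "_", str(head), deprel, "_", "_"])


# Invented toy sentence "After all we agree", annotated two ways (assumption:
# these trees only mimic the two styles under discussion).
pud_like = "\n".join([
    row(1, "After", 4, "advmod"),
    row(2, "all", 1, "fixed"),
    row(3, "we", 4, "nsubj"),
    row(4, "agree", 0, "root"),
])
ewt_like = "\n".join([
    row(1, "After", 2, "case"),
    row(2, "all", 4, "obl"),
    row(3, "we", 4, "nsubj"),
    row(4, "agree", 0, "root"),
])

only_in_first = set(fixed_expressions(pud_like)) - set(fixed_expressions(ewt_like))
print(sorted(only_in_first))  # ['after all']
```

Run over the real `.conllu` files of both treebanks, the same set difference would give a starting list of expressions to review, though it cannot distinguish a deliberate non-fixed analysis from an expression that simply never occurs in the other corpus.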

AngledLuffa commented 1 month ago

For that example of "after all", it sounds like "after (all that happened)", whereas "after all" is often used by itself as well:

PUD uses:

After all, the internet is not a luxury; it is an essential tool.
After all, our organizational performance is seldom measured in terms of how safe we are or how many rules we follow.

EWT uses:

After all, if you want to be an anti-Semite, there are subtle ways of doing it.
We are, after all, in this together.
This is, after all, his business

Then there is one EWT usage that is more like the example you gave:

Last year, after all was said and done

nschneid commented 1 month ago

I don't dispute that "after all" by itself is an idiom. It's just that most PP idioms are not fixed, only the ones that have weird grammatical behavior (e.g. they occur in places that ordinary PPs don't).

AngledLuffa commented 1 month ago

Got it. So, in other words, you'd actually rather see "after all" no longer labeled as fixed in PUD?

nschneid commented 1 month ago

Yeah

AngledLuffa commented 1 month ago

Thinking about it some more, "after all" by itself seems exactly the same to me as "of course", which is marked as fixed in both EWT and PUD.

amir-zeldes commented 1 month ago

We inherited a bunch of fixed decisions from SD, but I think the general sentiment is to minimize any further use of fixed as much as possible, at least in cases where the overt syntax is clear. IMHO "of course" shouldn't have been fixed either.

nschneid commented 1 month ago

> We inherited a bunch of fixed decisions from SD, but I think the general sentiment is to minimize any further use of fixed as much as possible, at least in cases where the overt syntax is clear. IMHO "of course" shouldn't have been fixed either.

+1

AngledLuffa commented 1 month ago

I don't see a reason to keep "of course" - no need to inherit old decisions. We can just un-fix those in whatever way we see fit, right?