UniversalDependencies / UD_English-EWT

English data
Creative Commons Attribution Share Alike 4.0 International
NUM missing NumForm/NumType #465

Open nschneid opened 8 months ago

nschneid commented 8 months ago

https://universal.grew.fr/?custom=653d1ce18d128

(Depends on #464)
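For anyone who wants to reproduce the check locally rather than through the Grew query, here is a minimal sketch that scans CoNLL-U text for `NUM` tokens lacking `NumForm` or `NumType`. It is not the query behind the link above. The names `find_num_missing_feats`, `parse_feats`, and `MissingFeature` are made up for illustration, and the sample rows are hypothetical, not taken from the treebank.

```python
from typing import Dict, Iterator, NamedTuple, Tuple


class MissingFeature(NamedTuple):
    """One NUM token that lacks at least one of the required features."""
    sent_id: str
    token_id: str
    form: str
    missing: Tuple[str, ...]


def parse_feats(feats: str) -> Dict[str, str]:
    """Turn a CoNLL-U FEATS string such as 'A=x|B=y' into a dict ('_' means empty)."""
    if not feats or feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def find_num_missing_feats(
    conllu_text: str,
    required: Tuple[str, ...] = ("NumForm", "NumType"),
) -> Iterator[MissingFeature]:
    """Yield every NUM token whose FEATS column lacks any feature in `required`."""
    sent_id = ""
    for line in conllu_text.splitlines():
        if line.startswith("# sent_id"):
            sent_id = line.split("=", 1)[1].strip()
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, form, upos, feats = cols[0], cols[1], cols[3], cols[5]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword token ranges and empty nodes
        if upos != "NUM":
            continue
        present = parse_feats(feats)
        missing = tuple(name for name in required if name not in present)
        if missing:
            yield MissingFeature(sent_id, token_id, form, missing)


if __name__ == "__main__":
    # Hypothetical rows for demonstration only.
    rows = [
        ["1", "(a)", "(a)", "NUM", "LS", "_", "0", "root", "_", "_"],
        ["2", "two", "two", "NUM", "CD", "NumForm=Word|NumType=Card", "1", "dep", "_", "_"],
    ]
    demo = "# sent_id = demo-1\n" + "\n".join("\t".join(r) for r in rows) + "\n"
    for hit in find_num_missing_feats(demo):
        print(hit.sent_id, hit.token_id, hit.form, ", ".join(hit.missing))
```

To run it over a treebank file, read the file as UTF-8 and pass its contents to `find_num_missing_feats`.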

rhdunn commented 8 months ago

In https://github.com/UniversalDependencies/UD_English-GENTLE/issues/5, I've proposed a NumForm=Alpha feature for cases like "(a)". Because they are not words, they currently have no sensible value to use.

nschneid commented 8 months ago

The guidelines aren't clear about how to tag letters functioning like numbers (marking items sequentially). I think that needs to be resolved before deciding whether such items deserve a NumType. This should be raised at https://github.com/UniversalDependencies/docs/issues.

rhdunn commented 8 months ago

I've raised https://github.com/UniversalDependencies/docs/issues/983 for the discussion on alphabetic list forms.

amir-zeldes commented 8 months ago

I'm with @dan-zeman that these are not really numbers, so I don't think they should have NumType. There are languages that truly use letters as numbers, mostly ancient languages (Coptic, Biblical Hebrew, Ancient Greek etc.). In those cases I would see the case for a NumType like this, but not for English list item markers. More in the other thread.

nschneid commented 2 months ago

The question has arisen: assuming list enumerators like "(A)" are NUM, what should their features be?

(I am setting aside German spelling, where I take it "1." can be used within a sentence to mean 'first', so the period makes it explicitly ordinal.)