UniversalDependencies / docs

Universal Dependencies online documentation
http://universaldependencies.org/
Apache License 2.0

Documentation of subtypes and layers #955

Closed Stormur closed 10 months ago

Stormur commented 1 year ago

I was wondering whether it would be possible to implement a more efficient way, for both editors and users, to document transversal subtypes, multi-subtyped relations and layers.

In particular:

nschneid commented 1 year ago
Stormur commented 12 months ago

Other labels for which these issues apply:

dan-zeman commented 12 months ago
  • NumValue: I would like this to be extended by means of a regular expression. As of now, it stops at 4 as a catch-all for any value greater than 3, but I think this might cause confusion. Ideally, I would like to be able to express any numeric value, and those are infinite!

NumValue does not seem useful to me. I just recently realized it was used in the Czech data in a completely useless manner and I removed it. And if you want to express any value of a number, then it is a semantic feature, not morphological, and it should be in MISC.

Stormur commented 12 months ago

I can understand its usefulness at least for lower numbers, though. Anyway, even if it is shifted to MISC (but could we not argue it is lexical as many others?), it still needs a way to accept possibly infinite values!
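To illustrate what "a way to accept possibly infinite values" could look like, here is a minimal sketch of a pattern-based check. The pattern and the function name `is_valid_numvalue` are hypothetical and only illustrate the idea; they are not taken from the UD validator or guidelines.

```python
import re

# Hypothetical pattern: one or more digits, optionally followed by a
# decimal part. An illustration only, not a rule from the UD validator.
NUMVALUE_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?")


def is_valid_numvalue(value: str) -> bool:
    """Return True if the whole string is a plain non-negative number."""
    return NUMVALUE_PATTERN.fullmatch(value) is not None
```

With such a check, `3`, `3.0` and `1000000` would all be accepted, while `three` would not, without having to enumerate the values.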

dan-zeman commented 12 months ago

> Anyway, even if it is shifted to MISC (but could we not argue it is lexical as many others?), it still needs a way to accept possibly infinite values!

We could say it is lexical, but in FEATS I would expect it to partition the numeral space into a finite set of categories that are in some sense interesting to annotate. (And for many lexical features it actually means they have some specific grammatical behavior; although, as Nathan just noted in another issue, PronType may be an exception.) If it is merely to signal that three, 3 and 3.0 all have the value of 3, then we indeed have an infinite set of values, but it does not fit in FEATS. On the other hand, in MISC it is quite OK (you do not have to enumerate the values anywhere to persuade the validator that they are legitimate).
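As a rough sketch of the MISC option, the snippet below reads a pipe-separated `Key=Value` MISC column and shows three surface forms sharing one value. The `NumValue=3` entries in MISC and the sample tokens are assumptions made up for illustration; `parse_misc` is a hypothetical helper, not part of any UD tool.

```python
def parse_misc(misc: str) -> dict:
    """Parse a CoNLL-U MISC column (pipe-separated Key=Value items) into a dict."""
    if misc == "_":  # "_" marks an empty column in CoNLL-U
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""  # items without "=" get an empty value
    return pairs


# Hypothetical tokens as (FORM, FEATS, MISC): FEATS stays a small closed set,
# while the open-ended numeric value lives in MISC.
tokens = [
    ("three", "NumType=Card", "NumValue=3"),
    ("3", "NumType=Card", "NumValue=3"),
    ("3.0", "NumType=Card", "NumValue=3"),
]

# Map each surface form to the numeric value stored in its MISC column.
values = {form: parse_misc(misc).get("NumValue") for form, _, misc in tokens}
```

Here all three forms map to the same `NumValue`, and no list of permitted values has to be declared in advance.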