Open mr-martian opened 1 year ago
@daghaug @gcelano
If I can add my 2 cents from my experience with Latin (in case harmonisation with it has some importance), I would comment:

* `nmod:poss`: we tried to apply it, but it is extremely difficult to do so and one wonders about its significance (so it is not currently used in Latin)
* `acl:relcl`: this is de facto mandatory
* `obl:tmod`: I think this is useful and it can be defined reasonably (however, one has to think about its possible conflict with `obl:arg`). We are now using it regularly.
* `obl:npmod`: we are not using it and I sincerely do not think it makes sense, especially not in a language with case inflection. Also, from the little I have seen, sometimes it seems to be rather an `advmod`/`advcl`.
* `cc:preconj`: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens. Latin is not using it (for the same reason that it is not using features such as `AdpType`).
* `advcl:relcl`: this absolutely needs documentation defining it, because as of now we have two different relations using this label. In Latin it is for free relatives; in English it is for "sentence relatives", which Latin currently treats by means of `advcl:pred`.
> `advcl:relcl`: this absolutely needs documentation defining it, because as of now we have two different relations using this label. In Latin it is for free relatives; in English it is for "sentence relatives", which Latin currently treats by means of `advcl:pred`.

English also uses `advcl:relcl` for free relatives where the WH word is an adverb, e.g. "I looked where you were sitting": `advcl:relcl(where, sitting)`.
So there are two concurrent uses of `advcl:relcl`? And what about "non-adverbial" relative words?
So, how I'm currently using them:

* `nmod:poss` this is essentially "is the dependent `Case=Gen` or `Poss=Yes`?" which, yeah, not that helpful
* `obl:npmod` my starting point was word aligning with the Hebrew treebank (which uses this for the infinitive absolute) and projecting, so this relation is present in places where the Septuagint copies the Hebrew construction of reduplicating the verb
* `cc:preconj` this is currently used for τε, but not for sentence-initial conjunctions, so I agree it's probably not useful
* `advcl:relcl` I copied the English usage on this one

> So there are two concurrent uses of `advcl:relcl`? And what about "non-adverbial" relative words?

I've updated the docs to explain this more clearly: https://github.com/UniversalDependencies/docs/blob/pages-source/_en/dep/advcl-relcl.md
(The page on the site isn't updating for some reason)
> `obl:npmod`: we are not using it and I sincerely do not think it makes sense

It is an odd label linguistically, to be sure, but if you want to use `obl:tmod`, then I think you will probably need `obl:npmod` as well. The `tmod` label is used for temporal noun phrases used adverbially, as in 1. When a similar phrase describes a non-temporal quantity, you need some kind of label, and that's what `obl:npmod` does:

1. Let's meet next week/obl:tmod
2. Let's meet the way/obl:npmod we planned originally

It has been pointed out that `obl:tmod` isn't really a syntactic category but more of a semantic subtype, so in a way `obl:npmod` subsumes it and I suppose it would basically cover accusativus graecus.
> `cc:preconj`: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens

I think this is not 100% true, but realistically you are right that it is mostly predictable. Hypothetically you could get something like "I arrived and/cc then both/cc:preconj danced and/cc sang", where it's not totally obvious what would be `cc:preconj`. That said, even when it is trivial, it's sometimes nice to be able to easily find all cases that have a `cc:preconj`, and it's easy enough to do, so why not?
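To make the "mostly predictable" point concrete, here is a minimal Python sketch. It is not taken from any UD tool: the token tuples, the head attachments for the example sentence, and the function names are all assumptions made for illustration. It compares a purely order-based guess with the explicit label.

```python
# Hypothetical analysis of "I arrived and then both danced and sang".
# Tokens are (id, form, head, deprel); the heads are assumed for illustration.
SENTENCE = [
    (1, "I", 2, "nsubj"),
    (2, "arrived", 0, "root"),
    (3, "and", 6, "cc"),
    (4, "then", 6, "advmod"),
    (5, "both", 6, "cc:preconj"),
    (6, "danced", 2, "conj"),
    (7, "and", 8, "cc"),
    (8, "sang", 6, "conj"),
]


def base(deprel):
    """Return the relation without its subtype, e.g. 'cc:preconj' -> 'cc'."""
    return deprel.split(":", 1)[0]


def preconj_candidates(sentence):
    """Guess cc:preconj from word order alone, ignoring existing subtypes.

    A token is a candidate if it is a cc that precedes its head and that
    head has a conj dependent further to the right.
    """
    candidates = []
    for tok_id, form, head, deprel in sentence:
        if base(deprel) != "cc" or tok_id > head:
            continue
        head_has_later_conj = any(
            other_head == head and base(other_rel) == "conj" and other_id > head
            for other_id, _, other_head, other_rel in sentence
        )
        if head_has_later_conj:
            candidates.append(form)
    return candidates


def annotated_preconj(sentence):
    """Return the forms that are explicitly labelled cc:preconj."""
    return [form for _, form, _, deprel in sentence if deprel == "cc:preconj"]
```

Under these assumed attachments, `preconj_candidates(SENTENCE)` returns both "and" and "both", while `annotated_preconj(SENTENCE)` returns only "both". So this particular order-based rule overgenerates on the nested coordination, whereas a search for the explicit label is a one-liner.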
> `nmod:poss` this is essentially "is the dependent `Case=Gen` or `Poss=Yes`?" which, yeah, not that helpful

That may be true, but it might still be nice for comparability to other languages which use `nmod:poss`.
> So, how I'm currently using them:
>
> * `nmod:poss` this is essentially "is the dependent `Case=Gen` or `Poss=Yes`?" which, yeah, not that helpful

I thought of it in part as the difference between subjective and objective genitive (e.g., for the wider public, amor matris, 'the love for the mother' vs. 'the love from the mother', both expressed by the genitive), but then I am not sure we can label the subjective one as "possessive"; probably this belongs at some level of reference annotation? Given an `nmod` relation, the feature `Poss=Yes` should not change this picture.
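For reference, the "is the dependent `Case=Gen` or `Poss=Yes`?" test quoted above could be sketched roughly as follows. This is only an illustration of the rule of thumb as described in this thread, not the actual conversion code of any treebank; the function names are hypothetical.

```python
def parse_feats(feats):
    """Turn a CoNLL-U FEATS string such as 'Case=Gen|Number=Sing' into a dict."""
    if feats in ("_", ""):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def maybe_poss(deprel, feats):
    """Relabel a plain nmod as nmod:poss if the dependent is Case=Gen or Poss=Yes.

    Any other relation is returned unchanged.
    """
    if deprel != "nmod":
        return deprel
    parsed = parse_feats(feats)
    if parsed.get("Case") == "Gen" or parsed.get("Poss") == "Yes":
        return "nmod:poss"
    return deprel
```

With this rule, matris in amor matris (assuming it carries `Case=Gen`) becomes `nmod:poss` on either reading, subjective or objective, which illustrates why the subtype adds nothing beyond what the features already say.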
> * `obl:npmod` my starting point was word aligning with the Hebrew treebank (which uses this for the infinitive absolute) and projecting, so this relation is present in places where the Septuagint copies the Hebrew construction of reduplicating the verb

But then, is it still related to Latin? :thinking:
> > `obl:npmod`: we are not using it and I sincerely do not think it makes sense
>
> It is an odd label linguistically, to be sure, but if you want to use `obl:tmod`, then I think you will probably need `obl:npmod` as well. The `tmod` label is used for temporal noun phrases used adverbially, as in 1. When a similar phrase describes a non-temporal quantity, you need some kind of label, and that's what `obl:npmod` does:
>
> 1. Let's meet next week/obl:tmod
> 2. Let's meet the way/obl:npmod we planned originally
>
> It has been pointed out that `obl:tmod` isn't really a syntactic category but more of a semantic subtype, so in a way `obl:npmod` subsumes it and I suppose it would basically cover accusativus graecus.

It is, as many others are, and for this reason it appears only as a subtype. Many subtypes (most?) are semantic; even `relcl` is in some sense (there just happens to be a reference to something in the matrix clause).
We are using it "transversally", so it also appears for `advmod`.

I do not think that `tmod` and `npmod` are related, exactly for this reason: "adverbiality" is already subsumed under UD's use of the oblique `obl` relation, so `tmod` is purely semantic, or let's say lexical, in that it depends either on the word (e.g. semper 'always') or on the predicate (e.g. vivo 'to live' with some argument denoting an event). I am not sure why it should cover accusativus graecus if this is already covered by `obl` (in its current interpretation) and if the purely syntactic fact of not being introduced by an element like an adposition is self-evident: what I mean is that a simple treebank query directly retrieves such cases.
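As a rough idea of what such a query amounts to, here is a minimal Python sketch over toy data. The sentences, their analyses, and the function name are assumptions made for illustration; a real query would run over CoNLL-U files or in a treebank search tool.

```python
# Hypothetical toy sentences; tokens are (id, form, head, deprel).
NEXT_WEEK = [
    (1, "We", 2, "nsubj"),
    (2, "meet", 0, "root"),
    (3, "next", 4, "amod"),
    (4, "week", 2, "obl:tmod"),
]
ON_MONDAY = [
    (1, "We", 2, "nsubj"),
    (2, "meet", 0, "root"),
    (3, "on", 4, "case"),
    (4, "Monday", 2, "obl"),
]


def bare_obliques(sentence):
    """Return the forms of obl dependents (any subtype) without a case dependent."""
    # Ids of all tokens that govern at least one case dependent.
    has_case = {
        head
        for _, _, head, deprel in sentence
        if deprel.split(":", 1)[0] == "case"
    }
    return [
        form
        for tok_id, form, _, deprel in sentence
        if deprel.split(":", 1)[0] == "obl" and tok_id not in has_case
    ]
```

Here `bare_obliques(NEXT_WEEK)` finds "week" and `bare_obliques(ON_MONDAY)` finds nothing, without looking at the subtype at all.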
In the example

> 2. Let's meet the way/obl:npmod we planned originally

I do not see what it is adding. It is already `obl`, and the fact that it appears as such without a preposition is probably lexically determined, so maybe it should be annotated at the token level. If `np` stands for noun phrase, it is stating the obvious, as an oblique is already intended to be one.
> > `cc:preconj`: I do not think this has any meaning at all since it is directly retrievable from the linear order of tokens
>
> I think this is not 100% true, but realistically you are right that it is mostly predictable. Hypothetically you could get something like "I arrived and/cc then both/cc:preconj danced and/cc sang", where it's not totally obvious what would be `cc:preconj`. That said, even when it is trivial, it's sometimes nice to be able to easily find all cases that have a `cc:preconj`, and it's easy enough to do, so why not?

Hm... this might be one further reason to tinker with UD's annotation of co-ordinations :thinking: I admit this still does not totally convince me that this subrelation is useful rather than a moot redundancy for a very functional relation...
> > `nmod:poss` this is essentially "is the dependent `Case=Gen` or `Poss=Yes`?" which, yeah, not that helpful
>
> That may be true, but it might still be nice for comparability to other languages which use `nmod:poss`.

True, but then we need a clear definition, which as of now does not seem to exist. There is probably also an overlap with `det`... or also just with the fact of a `PronType=Prs` depending as `nmod`?
> > So there are two concurrent uses of `advcl:relcl`? And what about "non-adverbial" relative words?
>
> I've updated the docs to explain this more clearly: https://github.com/UniversalDependencies/docs/blob/pages-source/_en/dep/advcl-relcl.md
> (The page on the site isn't updating for some reason)

I am now wondering whether or not these are indeed two different phenomena. I am sincerely confused.

... but is the subclause in "I looked where you were sitting" not rather an object of the main verb? I would instead think of something like "Go back whence you came" (correct?).
An adverb can't be a direct object in UD, right? I think an `obj` has to be a nominal.
(I agree the location phrase is a complement/argument of "look" here, but that's not what UD cares about.)
> It is already `obl`

Yes, `obl:npmod` and `obl:tmod` are subtypes of `obl`, so that part is natural. In many datasets, including English but also others such as Hebrew or Coptic, the plain `obl` is used specifically for prepositional phrases. I suspect it was originally a conversion remnant from Stanford Dependencies, which distinguished `prep` from `npadvmod` and `tmod`. These became the prototypes for `nmod`/`obl`, `obl:npmod` and `obl:tmod`.

Of course, the subtypes are totally optional, but that is the background for why all adverbial NPs (usually with some kind of spatiotemporal or extent semantics) have a subtype in languages that use them. So if you are using `:tmod`, I would also expect to see `:npmod` for non-temporal phrases. TBH if I were designing UD from scratch I would have just called such NPs `advmod` too, since that is essentially what accusativus graecus is, but `advmod` is prohibited on things not tagged `ADV`, so we have to use some kind of `obl` relation - the subtype is just to keep them separate from PP modifiers.
> An adverb can't be a direct object in UD, right? I think an `obj` has to be a nominal.
> (I agree the location phrase is a complement/argument of "look" here, but that's not what UD cares about.)

I was perhaps confused by the fact that "look" is intransitive in English. But I missed the more important fact that "where" is "promoted" into the matrix clause. But if this is the case, I do not understand why, keeping `advmod(look, where)`, "you were sitting" is not just `acl:relcl` as the "expansion" of "where".

I probably see where this is coming from: an `ADV` entails an `advcl` (propositional) and not an `acl`. But I do not know whether or not this is accepted by UD/the validator (and actually, this is one further case showing that "where" is not an "adverb", but a kind of pro-form). Still, another annotation strategy solving it would be to have "where you were sitting" as a whole as `advcl:relcl` of "look", and then this use of `advcl:relcl` would be the same as for Latin. But I know the treebanks treat "free relatives" differently.
> So if you are using `:tmod`, I would also expect to see `:npmod` for non-temporal phrases

Sorry if I am firm about this, but no. There is no logical relation. This all comes from some language-specific logic projected universally. Especially for Latin and Ancient Greek (and many other languages), there is nothing special about prepositionless arguments, as prepositions are just in alternation with `Case`.

I understand where this comes from, but I see (universally) more sense in a (hypothetical) semantic `obl:manner` for "the way" rather than a mechanical `obl:npmod`.
As for accusativus graecus, one might still envision an `adv*` annotation, but with `advcl`, maybe as `advcl:pred` (by the way, I personally think it has yet to be convincingly proven that accusativus graecus is really an adverbial rather than a second object... but this is another story). Anyway, the relation `obl` already means (or at least covers) something like "nominal adverbial": then the more meaningful subtype to be used here is `arg`, to keep track of a parallel complement/adjunct distinction, if this is what an `advmod` label would imply.
> more sense in a (hypothetical) semantic `obl:manner` for "the way" rather than a mechanical `obl:npmod`.

Sure, that would be perfectly logical and seems fine to me. `npmod` is just the underspecified one (not saying if it's manner, or extent or something else). I don't much like the label either (no NPs in dependencies), it's just a legacy thing from SD.

> As for accusativus graecus, one might still envision an `adv*` annotation, but with `advcl`, maybe as `advcl:pred`

Not if it's not a clause - then I would have expected (and wanted) `advmod`, but that is forbidden for nouns, and I lost that battle long ago ;)

> there is nothing special about prepositionless arguments ... Anyway, the relation `obl` already means (or at least covers) something like "nominal adverbial"

Yes, that's all correct and UD takes that position explicitly in having `obl` be the main label for cases with and without prepositions. It's just that in some languages maintainers like to make that distinction, so they use subtypes - these are in no way mandatory. I think if you are using `tmod` also for adverbs and phrases with prepositions, and not using a subtype for other domains, it just ends up being different from how other languages use that subtype. But maybe that's OK - I was just pointing it out, since that subtype comes from UD English and is used differently there.
> But maybe that's OK - I was just pointing it out, since that subtype comes from UD English and is used differently there.

Hm, I have to look into it. But judging from the scant documentation, we seem to be in line. I do not see differences... it is simply independent from adpositions, even in English (judging from the examples in the documentation). `tmod` itself as a label might come from UD English, but "time complements" are universal...

We also use `lmod`. I fear other domains would be less well defined and more problematic than these. Besides, I have not noticed attested relation subtypes for them, apart from subsubtypes of time and place.
> > As for accusativus graecus, one might still envision an `adv*` annotation, but with `advcl`, maybe as `advcl:pred`
>
> Not if it's not a clause - then I would have expected (and wanted) `advmod`, but that is forbidden for nouns, and I lost that battle long ago ;)

It might be a nominal clause. But I agree that it would be a lectio difficillima (a 'very difficult interpretation'), not even truly justified. So, currently, `obl` still is the best (traditional) option.
> it is simply independent from adpositions, even in English

I think that might be ambiguous - just to clarify, in UD English and related datasets following its practices, `:tmod` only occurs when there is no preposition.

> So, currently, `obl` still is the best (traditional) option.

Agreed!
I don't think this is actually resolved. I've been stripping subtypes from my treebank in the process of pushing to the UD repo, but I'd still like to actually include them.
OK, but then it needs a new milestone. v2.13 is over.
Currently, Ancient Greek has the following subtypes enabled:
`advcl:cmp`, `advmod:emph`, `aux:pass`, `csubj:pass`, `flat:foreign`, `flat:name`, `nsubj:outer`, `nsubj:pass`, `obl:agent`, `obl:arg`
In PTNK, I have additionally made use of the following:
Should I document these or should I reduce some or all of them to the non-subtyped relation?
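If reduction turns out to be the answer for some of them, a minimal sketch of the stripping step might look like the following. It assumes plain 10-column CoNLL-U input and uses the list of currently enabled Ancient Greek subtypes given above; the function names are hypothetical, and the enhanced DEPS column is deliberately left untouched.

```python
# Subtypes listed above as currently enabled for Ancient Greek.
ENABLED = {
    "advcl:cmp", "advmod:emph", "aux:pass", "csubj:pass", "flat:foreign",
    "flat:name", "nsubj:outer", "nsubj:pass", "obl:agent", "obl:arg",
}


def normalize_deprel(deprel, enabled=ENABLED):
    """Keep an enabled subtype; otherwise reduce to the bare relation."""
    return deprel if deprel in enabled else deprel.split(":", 1)[0]


def normalize_line(line, enabled=ENABLED):
    """Apply normalize_deprel to the DEPREL column of one CoNLL-U line.

    Comment lines, blank lines, and lines without exactly 10 tab-separated
    columns are returned unchanged.
    """
    if not line.strip() or line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        return line
    cols[7] = normalize_deprel(cols[7], enabled)  # DEPREL is the 8th column
    return "\t".join(cols) + newline
```

For example, `normalize_deprel("obl:tmod")` gives `obl`, while `normalize_deprel("obl:arg")` is kept as is. Documenting a subtype instead would simply mean adding it to the enabled set.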