Closed ftyers closed 4 years ago
If both these schemes are roughly equally valid, I'd suggest marking the noun as the head, because the verb often has very poor semantic value in light verb constructions in Hindi/Urdu. You can, for instance, have the expression "jhāḍū mārnā", literally "broomstick+NOUN hit+VERB", to mean "sweep" (e.g. the floor).
I'm not sure how this is in Kurdish or Turkic, though @MemduhG claims that similar constructions with semantically iffy verbs exist in Persian.
I think treating the verb as the head will be better for cross-linguistic parallelism.
I prefer the verb as the head. But I would still make "Mohan ka" a dependent of "intizar".
@dan-zeman Hmm, what relation would you give it? obj or nmod:poss?
I guess using the nominal host as the head would be more appropriate and would not even violate parallelism. Light verbs function more like auxiliaries: they govern case marking, agreement and TAM, while the host nominal is considered to be the true predicate in such constructions.
The evidence for host nominal as the head comes from code-mixed conversational data. In Hindi-English code-mixing, Hindi and Urdu speakers usually create new predicates by using English verbs as host nominals. English verbs are not directly used, rather they form complex predicates with appropriate light verbs.
"Maine Mohan ka wait kiya."
I-ERG Mohan 's wait did.
In this example, wait behaves like a nominal and heads the genitive construction Mohan ka wait.
Treating the host nominal as the head would also solve the problem of the object being marked by a genitive. In the original treatment, however, the object gets masked as nmod.
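To make the two candidate analyses of the code-mixed example easier to compare, here is a minimal Python sketch. The token indices, the relation labels (nsubj, nmod, case, compound:lvc) and the helper names `root_of` and `dependents_of` are assumptions made for illustration only; they are not taken from any treebank or from the guidelines.

```python
# Tokens of "Maine Mohan ka wait kiya", 1-indexed as in CoNLL-U.
TOKENS = ["Maine", "Mohan", "ka", "wait", "kiya"]

# Each analysis maps token index -> (head index, relation); head 0 marks the root.
# The labels are illustrative assumptions, not settled guidelines.
VERB_AS_HEAD = {
    1: (5, "nsubj"),
    2: (4, "nmod"),          # "Mohan ka" still depends on the nominal
    3: (2, "case"),
    4: (5, "compound:lvc"),
    5: (0, "root"),
}
NOUN_AS_HEAD = {
    1: (4, "nsubj"),
    2: (4, "nmod"),
    3: (2, "case"),
    4: (0, "root"),
    5: (4, "compound:lvc"),
}


def root_of(analysis):
    """Return the form of the token attached to the artificial root (head 0)."""
    return next(TOKENS[i - 1] for i, (head, _) in analysis.items() if head == 0)


def dependents_of(analysis, head):
    """Return the forms of all tokens whose head is the given index, in order."""
    return [TOKENS[i - 1] for i, (h, _) in sorted(analysis.items()) if h == head]


for name, analysis in [("verb as head", VERB_AS_HEAD), ("noun as head", NOUN_AS_HEAD)]:
    print(name, "-> root:", root_of(analysis))
```

Under these assumptions the genitive phrase attaches to "wait" in both trees; the analyses differ only in which token is the root and where the subject attaches.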
So what exactly is the analysis you are proposing? It is important that we maintain consistency across languages. Right now, it looks like we will end up with three different analyses of light-verb constructions, "obj" English (and many other languages), "compound:lvc" with verb as head in Persian, and "compound:lvc" with noun as head in Hindi. Are these distinctions really motivated or are we going back to "annotating the same thing in different ways" across languages?
If I am not wrong, Persian complex predicates show similar behavior. In both languages, we can use the nominal host as the head. The English-based analysis of complex predicates, i.e. using obj, is not at all valid for Hindi and Urdu or for most Indian languages. Objecthood is rather used as a diagnostic for differentiating between nouns that can and cannot form complex predicates.
This analysis would help cross-lingual parsing as well. A lexicalized parser trained (with cross-lingual word embeddings) on English data would almost always choose the nominal host over the light verb as the head in Hindi-Urdu.
I am still not convinced that the noun-as-head analysis is superior. If we accept that N+V is a complex predicate, it is nevertheless a verbal predicate. As far as I know, there are no examples of languages that drop the light verb in the same way that some languages drop the copula in nominal clauses. Hence, the predicate is not the noun, but the combination of verb and noun. And since the whole expression behaves like a verb, it is more natural from a syntactic point of view to keep the verb as the head. Remember: UD is syntactic annotation, not semantic role labeling. In addition, there is the practical consideration that keeping the verb as head saves us the work of reannotating the Persian treebank.
@ftyers: Definitely not obj. Intizar is not a transitive verb. And I also do not think that the ka postposition is the way of encoding the grammatical function P in Hindi.
nmod would seem appropriate to me. Whether or not with the :poss extension depends on how this extension is defined in the Hindi documentation :-) Syntactically it is the same as possessives, although semantically it is nowhere near expressions like Mohan ka ghar.
Same feeling here
2017-01-18 23:24 GMT+01:00 Joakim Nivre notifications@github.com:
I think treating the verb as the head will be better for cross-linguistic parallelism.
As far as Persian and maybe similar languages are concerned, I definitely prefer verb-as-head, for a number of reasons:
It would seem really odd to keep the verb "eat" as a dependent of "ground" or "eye", even though the verb has weak semantic content of its own. The light verb "eat" in these examples has a different meaning and can be interpreted as "meet/face/get": "to-face ground" and "to-get negative energy by envy/evil eye". In other words, the verb "eat" still functions as the main part of the compound, as noted above, in an abstract interpretation.
The light verb constructions "to-eat ground" and "to-eat eye" are intransitive, and as soon as the concepts turn into transitive constructions the light verb "hit" is used instead of "eat", e.g. "to-hit ground" (to hit something/someone to the ground) and "to-hit eye" (to give somebody the evil eye) (Seraji, 2015).
The light verb inflects for person and number.
@jnivre: I think treating the verb as the head will be better for cross-linguistic parallelism.
I cannot agree more!
I was just looking at some CV patterns in code-mixed data that we have been annotating in UD for some time at IIITH. It seems that in cases of ellipsis, light verbs are dropped rather than the nominal host. So, if we treat the light verb as the head, the arguments should be orphans; however, that's not the case if the host is treated as the head.
mujhe nahi pata.
I-DAT not know
I don't know.
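The orphan argument can be sketched in a few lines of Python. This is only an illustration: it assumes the unelided form is "mujhe nahi pata hai", and both attachment tables and the helper name `orphans_after_ellipsis` are hypothetical, not taken from the annotated data.

```python
def orphans_after_ellipsis(heads, elided):
    """Return the tokens that lose their head when `elided` is dropped."""
    return sorted(tok for tok, head in heads.items() if head == elided)


# token -> head token; "ROOT" marks the root of the clause.
# Both tables are assumed attachments for the assumed full form
# "mujhe nahi pata hai".
LIGHT_VERB_AS_HEAD = {"mujhe": "hai", "nahi": "hai", "pata": "hai", "hai": "ROOT"}
HOST_AS_HEAD = {"mujhe": "pata", "nahi": "pata", "pata": "ROOT", "hai": "pata"}

# Dropping "hai" strands every remaining token under the first analysis...
print(orphans_after_ellipsis(LIGHT_VERB_AS_HEAD, "hai"))
# ...and none under the second.
print(orphans_after_ellipsis(HOST_AS_HEAD, "hai"))
```

With the host nominal as head, the elided sentence keeps the same tree minus one leaf, which is the point being made about the ellipsis data.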
If my assumption is correct that the full version is mujhe nahi pata hai, then I think the verb hai should be treated as copula and pata should be the head anyway.
But it does not solve the more prototypical light verbs of course.
I don't think so. You can't explain the dative case on the first-person pronoun if you treat it as a copular construction. pata hai would be a psych-predicate here, assigning dative case to its internal argument.
hai would never be the head of the clause anyway; it would be a cop.
Oh no wait, sorry, you're right - these kinds of verbs are ones I've been having issues with in Marathi (#486) too.
We will treat host nominals as heads in elided constructions and light verbs as heads in normal ones until we reach a cross-lingual consensus.
Isn't the pronoun an argument of pata and isn't the dative required by pata?
Since pata is the true predicate like other host nominals in complex predicates, it controls the argument structure and assigns case. So yes, the pronoun is an argument of pata and will receive case from it.
On Sep 9, 2017 12:54 AM, "Dan Zeman" notifications@github.com wrote:
Isn't the pronoun an argument of pata and isn't the dative required by pata?
We are currently having a discussion about conversion of the Hindi/Urdu treebanks to version 2.0. One of the questions we have is which token in a light verb construction such as (1) should be the head.
The light verb construction "intizar kiya" takes Mohan as its second core argument in the genitive (because of the nominal item "intizar"), while the agreement is all on the verb "kiya".
Going on what Joakim said in Osaka, it shouldn't really matter which one we pick because the combination should be treated as a single unit. However it would be good to have consistent guidelines with respect to other languages exhibiting this phenomenon (e.g. Persian, Kurdish, Turkic).
#386 and #255 are relevant here.