propbank / propbank-frames

Lexicon of frame files used by Propbank annotation. A searchable, readable version of the latest release is here: http://propbank.github.io/v3.4.0/frames/
Creative Commons Attribution Share Alike 4.0 International

inconsistency give.ARG1 ~> theme vs #11

Closed arademaker closed 2 years ago

arademaker commented 2 years ago

The theme is usually associated with ARG1 in both give.01 and buy.01:

http://verbs.colorado.edu/propbank/framesets-english-aliases/give.html

Arg0-PAG: giver (vnrole: 13.1-1-agent) Arg1-PPT: thing given (vnrole: 13.1-1-theme) Arg2-GOL: entity given to (vnrole: 13.1-1-recipient)

http://verbs.colorado.edu/propbank/framesets-english-aliases/buy.html

Arg0-PAG: buyer (vnrole: 13.5.1-agent) Arg1-PPT: thing bought (vnrole: 13.5.1-theme) Arg2-DIR: seller (vnrole: 13.5.1-source) Arg3-VSP: price paid (vnrole: 13.5.1-asset) Arg4-GOL: benefactive (vnrole: 13.5.1-beneficiary)

but in the promise.01 it was associated with ARG2. Any special reason for that?

http://verbs.colorado.edu/propbank/framesets-english-aliases/promise.html

Arg0-PAG: promiser (vnrole: 13.3-agent) Arg1-GOL: person promised to (vnrole: 13.3-goal) Arg2-PPT: promised action (vnrole: 13.3-theme)

This appeared in the context of our current work on revision and consolidation of the Portuguese Propbank at http://github.com/LR-POR/propbank-pt
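
To make the comparison concrete, here is a minimal Python sketch that parses the three role listings quoted above and reports which argument number carries the VerbNet theme role. The regular expression and the helper names (`parse_roles`, `arg_for_vnrole`) are assumptions made for illustration. They are not part of any PropBank tooling, and the pattern only covers the listing format shown in this comment.

```python
import re

# Role listings copied verbatim from the frame pages quoted above.
ROLESETS = {
    "give.01": "Arg0-PAG: giver (vnrole: 13.1-1-agent) Arg1-PPT: thing given (vnrole: 13.1-1-theme) Arg2-GOL: entity given to (vnrole: 13.1-1-recipient)",
    "buy.01": "Arg0-PAG: buyer (vnrole: 13.5.1-agent) Arg1-PPT: thing bought (vnrole: 13.5.1-theme) Arg2-DIR: seller (vnrole: 13.5.1-source) Arg3-VSP: price paid (vnrole: 13.5.1-asset) Arg4-GOL: benefactive (vnrole: 13.5.1-beneficiary)",
    "promise.01": "Arg0-PAG: promiser (vnrole: 13.3-agent) Arg1-GOL: person promised to (vnrole: 13.3-goal) Arg2-PPT: promised action (vnrole: 13.3-theme)",
}

# Hypothetical pattern: ArgN-TAG: description (vnrole: CLASS-role)
ROLE_RE = re.compile(r"Arg(\d)-([A-Z]{3}): (.+?) \(vnrole: ([\d.\-]+)-([a-z]+)\)")


def parse_roles(listing):
    """Split a flattened role listing into one dict per argument."""
    return [
        {"arg": int(n), "tag": tag, "descr": descr, "vnclass": cls, "vnrole": role}
        for n, tag, descr, cls, role in ROLE_RE.findall(listing)
    ]


def arg_for_vnrole(listing, vnrole):
    """Return the argument number linked to the given VerbNet role, or None."""
    for role in parse_roles(listing):
        if role["vnrole"] == vnrole:
            return role["arg"]
    return None


for roleset, listing in ROLESETS.items():
    print(roleset, "theme -> Arg%s" % arg_for_vnrole(listing, "theme"))
```

On these three listings the theme lands on Arg1 for give.01 and buy.01 and on Arg2 for promise.01, which is the mismatch this issue asks about.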

jbonn commented 2 years ago

Hi Alexandre,

The function tags and argument numbering are closely related to the roles of the associated VerbNet class; see https://uvi.colorado.edu/verbnet/promise-37.13 for an associated VN class.

While the function tags are more closely related to semantics, the argument numbering also takes into account syntactic primacy and ordering. For this verb, when all three arguments are present in a sentence, the argument numbering reflects the order in which the arguments are presented syntactically:

[Arg0 I] promised [Arg1 him] [Arg2 that I'd go to the store to buy bread]

*[Arg0 I] promised [Arg2 that I'd go to the store to buy bread] [Arg1 to him]

Since the second sentence is mostly bad, its ordering was considered less desirable.

The argument numbering reflects the first sentence. Of course this shakes out slightly differently for each verb depending on exactly what its syntactic behavior is, and PB rolesets may be associated with more than one VN class. But in general, for understanding PB argument numbering, the best rule of thumb is that Arg0 can correspond to syntactic subjects, and if there is a direct object, Arg1 is most likely to be associated with it. That's a better rule of thumb than that Arg1 has to be PPT.

I hope this is helpful!

Best,

Julia

arademaker commented 2 years ago

Thank you @jbonn for explaining the rationale behind the order of the arguments. So "him" in your example is both the direct object and the first argument in the linear order of the sentence, making it a strong candidate for ARG1.

Actually, this raises an interesting discussion about Propbank in other languages and their mappings to the English frames and rolesets. In Portuguese, I have just fixed the annotations to reflect the decision to model the verb prometer.01, the translation of promise.01, as described at http://143.107.183.175:21380/verbobrasil/textoFrames/prometer-v.html. That is, ARG1 = theme and ARG2 = recipient. One of the corpus sentences is:

─┮  
 │   ╭─╼ O DET det 1 2  
 │ ╭─┶ governo NOUN nsubj 2 3  
 ╰─┾ promete VERB root 3 0                <= promise.01
   │ ╭─╼ a ADP case 4 6  
   │ ├─╼ a DET det 5 6  
   ├─┶ sociedade NOUN obj 6 3             <= ARG2
   │ ╭─╼ que SCONJ mark 7 15  
   │ │ ╭─╼ , PUNCT punct 8 10  
   │ │ ├─╼ de ADP case 9 10  
   │ ├─┾ agora ADV advmod 10 15  
   │ │ │ ╭─╼ em ADP case 11 12  
   │ │ ├─┶ diante ADV advmod 12 10  
   │ │ ╰─╼ , PUNCT punct 13 10  
   │ ├─╼ vai AUX aux 14 15  
   ├─┾ mudar VERB ccomp 15 3              <= ARG1
   │ │ ╭─╼ de ADP case 16 17  
   │ ├─┶ vida NOUN obj 17 15  
   │ │ ╭─╼ , PUNCT punct 18 20  
   │ │ ├─╼ vai AUX aux 19 20  
   │ ├─┾ garantir VERB conj 20 15  
   │ │ │ ╭─╼ o DET det 21 22  
   │ │ ╰─┾ equilíbrio NOUN obj 22 20  
   │ │   │ ╭─╼ de ADP case 23 25  
   │ │   │ ├─╼ as DET det 24 25  
   │ │   ╰─┾ contas NOUN nmod 25 22  
   │ │     ╰─╼ públicas ADJ amod 26 25  
   │ │ ╭─╼ e CCONJ cc 27 29  
   │ │ ├─╼ vai AUX aux 28 29  
   │ ╰─┾ parar VERB conj 29 15  
   │   │ ╭─╼ de SCONJ mark 30 31  
   │   ╰─┾ abusar VERB xcomp 31 29  
   │     │ ╭─╼ de ADP case 32 35  
   │     │ ├─╼ o DET det 33 35  
   │     │ ├─╼ seu DET det 34 35  
   │     ╰─┾ monopólio NOUN obj 35 31  
   │       │ ╭─╼ de ADP case 36 37  
   │       ╰─┾ criação NOUN nmod 37 35  
   │         │ ╭─╼ de ADP case 38 39  
   │         ╰─┶ moeda NOUN nmod 39 37  
   ╰─╼ . PUNCT punct 40 3  

So the obj is annotated as ARG2 and the ccomp as ARG1. Regarding the order in which the arguments are normally presented, I believe that in Portuguese I would also prefer [1] over [2] (translations of your examples above):

  1. Eu prometi a ele que iria na padaria comprar pão.
  2. *Eu prometi que iria na padaria comprar pão à ele.

but @leoalenc, who is leading our work on the construction of an HPSG grammar for Portuguese, disagrees. Maybe he can comment here. BTW, this whole discussion came up because we are using the UD Bosque corpus and Propbank-PT to extract verb valences for the grammar lexicon. Of course, we are also interested in maintaining and improving these corpora anyway.
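
As a rough illustration of the valence extraction mentioned above, here is a minimal Python sketch that reads the arguments of the root verb off its direct dependents in the corpus sentence shown earlier. The `Token` tuple, the `valence` helper, and the `PROMETER_MAP` table are hypothetical names introduced for this example. The obj and ccomp entries follow the annotation in the tree, while nsubj -> ARG0 is an assumption that the tree itself does not show.

```python
from collections import namedtuple

Token = namedtuple("Token", "id form upos deprel head")

# The root verb and its direct dependents, copied from the tree above
# (all other tokens are omitted for brevity).
TOKENS = [
    Token(2, "governo", "NOUN", "nsubj", 3),
    Token(3, "promete", "VERB", "root", 0),
    Token(6, "sociedade", "NOUN", "obj", 3),
    Token(15, "mudar", "VERB", "ccomp", 3),
    Token(40, ".", "PUNCT", "punct", 3),
]

# deprel -> argument label for prometer.01.
# obj and ccomp follow the tree annotation; nsubj -> ARG0 is an assumption.
PROMETER_MAP = {"nsubj": "ARG0", "obj": "ARG2", "ccomp": "ARG1"}


def valence(tokens, verb_id, mapping):
    """List (label, deprel, head word) for each mapped dependent of the verb,
    in the linear order of the sentence."""
    return [
        (mapping[tok.deprel], tok.deprel, tok.form)
        for tok in tokens
        if tok.head == verb_id and tok.deprel in mapping
    ]


for label, deprel, form in valence(TOKENS, 3, PROMETER_MAP):
    print(label, deprel, form)
```

For this sentence the extracted linear order is ARG0, ARG2, ARG1: the recipient precedes the clausal theme.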

But back to my point: are you aware of any guidelines or best practices for translating the English frames to other languages? Mismatches of this kind can potentially happen a lot, and I am not sure the current approach for Propbank-PT is very consistent. Another questionable decision in the Portuguese frames is that the sense entregar.01 was mapped to two different English frames. Does that make sense?

In the EuroWordNet project, Piek Vossen first named the two methods for constructing a wordnet for a new language: the merge approach and the expand approach.

In the merge approach, the synsets and their internal semantic relations are developed independently of the Princeton WordNet (PWN). Afterward, the equivalence relations to PWN are generated. Such wordnets are independent of PWN and maintain language-specific properties. In the expand approach, the PWN synsets are translated into the other language and the PWN relations are taken over and adapted. Therefore, the resulting wordnets are very close to PWN.

I wonder if similar alternatives can be considered to construct a propbank in a new language. After all, almost all propbanks in languages other than English want to provide some link to the English frames and rolesets. See our https://github.com/System-T/UniversalPropositions.

leoalenc commented 2 years ago

Thanks for your explanation, @jbonn. In Portuguese (not only in the Brazilian variety), arguments realized as PPs and CPs tend to be placed after arguments realized as NPs, compare (1) and (2) with (3), but pronominalized arguments tend to be placed before NP arguments, see (4) (in fact, this general principle seems to roughly apply to other languages as well, e.g., English, German, and French):

Arg1:theme < Arg2:recipient

  1. Flávio prometeu bicicleta à filha
     Flávio promise:IND;PST;3SG bike to his daughter [...].
     'Flávio promised a bike to his daughter [...].'

Arg1:theme < Arg2:recipient

  2. O vendedor [...] prometeu uma bicicleta para a filha mais nova [...]
     The seller promise:IND;PST;3SG a bike to his youngest daughter
     'The seller [...] promised a bike to his youngest daughter [...]'

Arg2:recipient < Arg1:theme

  3. Ela prometeu à população que iria terminar todas as obras.
     She promise:IND;PST;3SG the population that go:COND;3SG finish:INF all the works.
     'She promised the population that she would finish all the works.'

Arg2:recipient < Arg1:theme

  4. Prometeu a ela um bom aumento.
     promise:IND;PST;3SG to her a good raise
     'He promised her a good raise.'

As noted by @arademaker, in the Brazilian Propbank-Br, the argument bearing the theme semantic role is assigned the Arg1 label, while the recipient is assigned Arg2. As I explained here, Propbank-Br consistently maps the theme and recipient (or destination etc.) arguments of ditransitives to Arg1 and Arg2, respectively, following the canonical (unmarked) order of (1) and (2). This is the pattern used for the transfer of possession, caused change of location and communication verbs I consulted:

http://143.107.183.175:21380/verbobrasil/textoFrames/dar-v.html
http://143.107.183.175:21380/verbobrasil/textoFrames/entregar-v.html
http://143.107.183.175:21380/verbobrasil/textoFrames/levar-v.html
http://143.107.183.175:21380/verbobrasil/textoFrames/dizer-v.html

The rationale behind this is the unmarked order NP < PP < CP. Since a theme participant that denotes an individual is preferably mapped to an NP, whatever recipient, destination, source etc. arguments the verb governs must be realized as a PP. Theme participants corresponding to propositions, questions etc. may be realized by CPs, infinitival clauses etc., leading to the inverted pattern Arg2 < Arg1, because PPs tend to precede CPs etc. Pronominalized arguments also result in inverted patterns, since pronouns tend to precede full NPs or PPs heading a full NP (and clitic pronouns must attach to the verb).
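
The ordering tendency described above can be sketched as a simple ranking. This is only a toy model under stated assumptions: the `RANK` table and the `unmarked_order` helper are hypothetical, the labels NP, PP, CP come from the explanation above, and PRON stands for any pronominalized argument, including a PP such as "a ela" in (4). It ignores clitic placement and every other factor that affects real word order.

```python
# Rank of each realization in the unmarked order; lower means earlier.
# PRON < NP < PP < CP is a simplification of the tendencies described above.
RANK = {"PRON": 0, "NP": 1, "PP": 2, "CP": 3}


def unmarked_order(args):
    """args: list of (label, realization) pairs.
    Returns the labels in the predicted linear order."""
    return [label for label, _ in sorted(args, key=lambda a: RANK[a[1]])]


# (1)/(2): theme as NP, recipient as PP
print(unmarked_order([("Arg1", "NP"), ("Arg2", "PP")]))    # ['Arg1', 'Arg2']
# (3): theme as CP, recipient as PP
print(unmarked_order([("Arg1", "CP"), ("Arg2", "PP")]))    # ['Arg2', 'Arg1']
# (4): theme as NP, pronominalized recipient
print(unmarked_order([("Arg1", "NP"), ("Arg2", "PRON")]))  # ['Arg2', 'Arg1']
```

The three calls reproduce the patterns in examples (1) to (4): the numbering stays fixed per roleset, and only the realization of each argument changes the surface order.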

arademaker commented 2 years ago

Anyway, it looks like this issue can be closed. We agree that for English the argument order makes sense. The same arguments justify a different order for Portuguese.