vered1986 / OKR

OKR: A Consolidated Open Knowledge Representation for Multiple Texts

Integrating PropS proposition extraction #4

Closed gabrielStanovsky closed 7 years ago

gabrielStanovsky commented 7 years ago

@kleinay This is a first pull request adding the ability to get the JSON objects we discussed. You can see a usage example in the "main" of the PropS wrapper, along with an example JSON output.

Please skim these changes, and feel free to request changes to anything you think could be done better, or to ask for clarification where needed.

Let's discuss these changes here, and decide together when we want to merge them upstream.

Thanks!

kleinay commented 7 years ago

Regarding the "Template" attribute of a predicate: I wonder whether there is a way to make such templates easier to use (replacing symbols in a string requires a bit of processing).

gabrielStanovsky commented 7 years ago

Changes in c789353729557bda13fe0fb5e31c6467b8c4625b:

Additionally, in 22e4849d4d2b4d7294baa69438b3610b71eb5d05 I added a list of word indices for both predicates and entities; this usually comes in handy at some point :) So now, everything that includes surface words should be a tuple of words and indices:

{'Entities': {'A1': ('The Syrian plane', (0, 1, 2)), 'A2': ('Moscow', (8,))},
 'Predicates': {'P1': {'Bare predicate': ('was forced', (3, 4)),
                       'Head': {'Lemma': u'force',
                                'POS': 'VBN',
                                'Surface': ('forced', 4)},
                       'Template': '{A1} was forced {P2}'},
                'P2': {'Bare predicate': ('to land in', (5, 6, 7)),
                       'Head': {'Lemma': 'land',
                                'POS': 'VB',
                                'Surface': ('land', 6)},
                       'Template': '{A1} to land in {A2}'}},
 'Sentence': 'The Syrian plane was forced to land in Moscow .'}
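
Since the templates use `{A1}`-style placeholders, one way to consume them is Python's built-in `str.format`. The following is only a sketch: `surface_forms` and `fill_template` are hypothetical helpers, not part of PropS or OKR, and rendering a nested predicate such as `{P2}` by its bare predicate is an assumption made for illustration.

```python
# Trimmed copy of the example output above (only the fields used here).
props = {
    'Entities': {'A1': ('The Syrian plane', (0, 1, 2)), 'A2': ('Moscow', (8,))},
    'Predicates': {
        'P1': {'Bare predicate': ('was forced', (3, 4)),
               'Template': '{A1} was forced {P2}'},
        'P2': {'Bare predicate': ('to land in', (5, 6, 7)),
               'Template': '{A1} to land in {A2}'},
    },
}


def surface_forms(props):
    """Map every symbol (A1, P2, ...) to the words it covers."""
    # Entities are (words, indices) tuples; keep only the words.
    forms = {sym: words for sym, (words, _indices) in props['Entities'].items()}
    # Assumption: a predicate used as an argument is shown as its bare predicate.
    for sym, pred in props['Predicates'].items():
        forms[sym] = pred['Bare predicate'][0]
    return forms


def fill_template(template, forms):
    """Substitute the placeholders in a template with their surface words."""
    return template.format(**forms)


forms = surface_forms(props)
for sym, pred in sorted(props['Predicates'].items()):
    print(sym, '->', fill_template(pred['Template'], forms))
```

With this choice, `P2` renders as "The Syrian plane to land in Moscow"; a recursive expansion of nested predicates would be an alternative design.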

kleinay commented 7 years ago

I have another change request for the Predicate data: please add an "Arguments" attribute containing the symbols in the template. E.g., for a predicate with template '{A1} to land in {A2}', add "Arguments": ['A1', 'A2'] to its information. Should be easy.
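
As a rough illustration of what is being asked for, the symbols can be read off a template with the standard library's `string.Formatter`. This is a minimal sketch; `template_arguments` is a hypothetical name, and it is not necessarily how the PropS wrapper computes the attribute.

```python
from string import Formatter


def template_arguments(template):
    """Return the placeholder symbols of a template, in order of first appearance."""
    symbols = []
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing stretch of plain text.
    for _literal, field, _spec, _conversion in Formatter().parse(template):
        if field and field not in symbols:
            symbols.append(field)
    return symbols


print(template_arguments('{A1} to land in {A2}'))
```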

gabrielStanovsky commented 7 years ago

Done, please review and approve here: https://github.com/vered1986/OKR/pull/5