vered1986 / OKR

OKR: A Consolidated Open Knowledge Representation for Multiple Texts

This is the code used in the paper:

"A Consolidated Open Knowledge Representation for Multiple Texts"
Rachel Wities, Vered Shwartz, Gabriel Stanovsky, Meni Adler, Ori Shapira, Shyam Upadhyay, Dan Roth, Eugenio Martinez Camara, Iryna Gurevych and Ido Dagan. LSDSem 2017. link (TBD).

The dataset developed for the paper can be found here (TBD).


Prerequisites:

Quick Start:

The repository contains the following directories:

Running the baseline system:

From src/baseline_system, run:

python compute_baseline_subtasks.py ../../data/baseline/dev ../../data/baseline/test
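
If you prefer to launch the baseline from another script, the command above can be wrapped as in the following minimal sketch. The function names `baseline_command` and `run_baseline` and the `repo_root` parameter are hypothetical and not part of this repository; only the script name and the two data paths come from the command above. The sketch assumes that `python` on your PATH is the interpreter the repository expects.

```python
import subprocess
from pathlib import Path


def baseline_command(dev_dir="../../data/baseline/dev",
                     test_dir="../../data/baseline/test"):
    """Build the argument list for the baseline run (hypothetical helper)."""
    return ["python", "compute_baseline_subtasks.py", dev_dir, test_dir]


def run_baseline(repo_root):
    """Run the baseline from src/baseline_system so the relative paths resolve."""
    workdir = Path(repo_root) / "src" / "baseline_system"
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(baseline_command(), cwd=str(workdir), check=True)
```

Calling `run_baseline("/path/to/OKR")` would then be equivalent to changing into src/baseline_system and running the command by hand.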

In the entity mention component, the F1 score we originally reported was 0.58. We managed to raise it to 0.61 by changing the spaCy tokenization. If you want the original code that returns the original 0.58 score, set GET_ORIGINAL_SCORE to True on line 22 of eval_entity_mention.py.
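
To illustrate why tokenization can move an F1 score, here is a generic sketch of exact-match F1 over mention spans. It is not the evaluation code in eval_entity_mention.py; the function `mention_f1` and the representation of mentions as (start, end) token-index pairs are assumptions made for this example only.

```python
def mention_f1(predicted, gold):
    """Exact-match F1 between two collections of (start, end) token spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / float(len(predicted))
    recall = true_positives / float(len(gold))
    return 2 * precision * recall / (precision + recall)


# A span whose boundary shifts by one token no longer matches the gold span,
# which is one way a tokenization change can affect the score.
gold_spans = [(0, 1), (3, 4), (6, 6)]
predicted_spans = [(0, 1), (3, 5)]
score = mention_f1(predicted_spans, gold_spans)
```

In this toy case one of two predictions matches one of three gold spans, so precision is 1/2 and recall is 1/3.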

The entailment component requires external resource files. The entity entailment resource files are in the resources directory. The predicate entailment file is much larger, so instead of the file itself we provide a script that builds it from the original resource (reverb_local_clsf_all.txt from here).

Detailed description of the OKR object:

TBD