vcvpaiva / rte-sick

RTE Experiment

The idea is to run sentences A and B of each pair in the SICK.txt corpus through FreeLing and to feed its tokenization to SyntaxNet to obtain the UD (Universal Dependencies) parses. We then generate a CoNLL file from the SyntaxNet output, augmented with the lemmas and senses obtained from FreeLing.
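
As a rough illustration of the first step, the sketch below pulls the two sentences of each pair out of a tab-separated file. It assumes the file has a header row with columns named `pair_ID`, `sentence_A`, and `sentence_B`; the function name `read_pairs` and the sample row are hypothetical and are not taken from this repository's scripts.

```python
import csv
import io


def read_pairs(handle):
    """Yield (pair_id, sentence_a, sentence_b) from a tab-separated file.

    Assumes a header row naming the columns pair_ID, sentence_A, sentence_B.
    """
    # QUOTE_NONE keeps any quote characters in the sentences untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row["pair_ID"], row["sentence_A"], row["sentence_B"]


# Tiny in-memory stand-in for SICK.txt (made-up row, not real corpus data).
sample = io.StringIO(
    "pair_ID\tsentence_A\tsentence_B\n"
    "1\tA dog is running\tA dog runs\n"
)
pairs = list(read_pairs(sample))
```

With the real corpus, the same function could be called on `open("SICK.txt", encoding="utf-8", newline="")` instead of the in-memory sample.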

We needed to disable FreeLing's multiword expression (MWE) module.

We also add the original POS tag from FreeLing to the MISC field so that we can compare it with SyntaxNet's. The FreeLing POS tag is also relevant for understanding which sense was selected.
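
To make the shape of an augmented token line concrete, here is a minimal sketch that assumes the ten-column CoNLL-U layout (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The function name `conllu_row` and the MISC keys `FlPos` and `Sense` are hypothetical, as is the choice to store the sense in MISC; the actual scripts may use a different layout. The sense identifier in the example is a made-up placeholder.

```python
def conllu_row(idx, form, lemma, upos, head, deprel, freeling_pos, sense=None):
    """Build one tab-separated, ten-column token line.

    The FreeLing POS tag (and, by assumption, the sense) is stored as
    key=value pairs in the last column, MISC, separated by '|'.
    """
    misc = ["FlPos=" + freeling_pos]  # hypothetical key name
    if sense:
        misc.append("Sense=" + sense)  # hypothetical key name
    fields = [
        str(idx), form, lemma, upos,
        "_", "_",             # XPOS and FEATS left empty in this sketch
        str(head), deprel,
        "_",                  # DEPS left empty
        "|".join(misc),
    ]
    return "\t".join(fields)


# Placeholder values for illustration only.
row = conllu_row(2, "dogs", "dog", "NOUN", 3, "nsubj", "NNS", "00000000-n")
```

Keeping the FreeLing tag next to the parser's own tag on the same line makes the two easy to compare token by token.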

To reproduce:

  1. Download SICK.txt from: http://clic.cimec.unitn.it/composes/sick.html

  2. Install FreeLing 4.0 from: http://nlp.cs.upc.edu/freeling/

  3. Install SyntaxNet from: https://github.com/tensorflow/models/tree/master/syntaxnet

  4. Download the English UD pre-trained model from: https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md

  5. Download the SUMO knowledge base from: https://github.com/ontologyportal/sumo

  6. Edit the paths in env.sh, process-freeling.py, and parse.sh so that they point to your local installations.

  7. Run the stats.sh script.

See the README files in pairs/ and derivated/ for more information.