vered1986 / HypeNET

Integrated path-based and distributional method for hypernymy detection

HypeNET: Integrated Path-based and Distributional Method for Hypernymy Detection

This is the code used in the paper:

"Improving Hypernymy Detection with an Integrated Path-based and Distributional Method"
Vered Shwartz, Yoav Goldberg and Ido Dagan. ACL 2016. link

It is used to classify hypernymy relations between term pairs, combining distributional information about each term with path-based information encoded using an LSTM.
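The integrated representation described above concatenates the distributional vectors of the two terms with an averaged encoding of the dependency paths connecting them. The sketch below illustrates this with NumPy, using arbitrary toy dimensions and random vectors in place of the LSTM path encoder; the names and shapes are illustrative, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, PATH_DIM = 50, 60  # toy dimensions, not the paper's settings

def encode_paths(path_vectors, counts):
    # Average of the (here: stand-in) LSTM-encoded path vectors,
    # weighted by how often each path connects the term pair in the corpus.
    counts = np.asarray(counts, dtype=float)
    return (counts[:, None] * path_vectors).sum(axis=0) / counts.sum()

def integrated_features(x_vec, y_vec, path_vectors, counts):
    # Integrated representation: [v_x ; averaged path encoding ; v_y]
    return np.concatenate([x_vec, encode_paths(path_vectors, counts), y_vec])

x_vec = rng.standard_normal(EMB_DIM)
y_vec = rng.standard_normal(EMB_DIM)
paths = rng.standard_normal((3, PATH_DIM))  # 3 paths between the term pair
feats = integrated_features(x_vec, y_vec, paths, counts=[5, 2, 1])
print(feats.shape)  # (160,)
```

The concatenated vector would then be fed to the classifier; in the real model the path vectors come from the LSTM rather than a random generator.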


Version 2:

Major features and improvements:

Bug fixes:

To reproduce the results reported in the paper, please use V1. The current version achieves similar results; the integrated model's performance on the randomly split dataset is: Precision: 0.918, Recall: 0.907, F1: 0.912.


Consider using our new project, LexNET! It supports classification of multiple semantic relations, and contains several model enhancements and detailed documentation.


Prerequisites:

Quick Start:

The repository contains the following directories:

To create a processed corpus, download a Wikipedia dump, and run:

bash create_resource_from_corpus.sh [wiki_dump_file] [resource_prefix]

Where resource_prefix is the file path and prefix of the corpus files (e.g. corpus/wiki), such that the corpus directory will eventually contain the wiki_*.db files created by this script.
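At the heart of corpus processing is extracting the dependency path connecting each term pair in a sentence. The actual script relies on a full dependency parser; the toy sketch below only shows the path-finding step, using breadth-first search over a hand-built dependency graph (the example sentence and edge labels are hypothetical).

```python
from collections import deque

def shortest_dependency_path(edges, source, target):
    """BFS over an undirected dependency graph, given as
    (head, dependent, label) triples; returns the token sequence
    connecting source to target, or None if they are unconnected."""
    graph = {}
    for head, dep, _label in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue, parents = deque([source]), {source: None}
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nbr in graph.get(node, []):
            if nbr not in parents:
                parents[nbr] = node
                queue.append(nbr)
    return None

# Toy parse of "parrot is a bird": path between the pair (parrot, bird)
edges = [("is", "parrot", "nsubj"), ("is", "bird", "attr"), ("bird", "a", "det")]
print(shortest_dependency_path(edges, "parrot", "bird"))  # ['parrot', 'is', 'bird']
```

In the real pipeline these paths (with their edge labels and directions) are what get stored in the wiki_*.db files and later fed to the LSTM.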

To train the integrated model, run:

python train_integrated.py [resource_prefix] [dataset_prefix] [model_prefix_file] [embeddings_file] [alpha] [word_dropout_rate]

Where:

Similarly, you can train the path-based model with train_path_based.py, or test either pre-trained model with test_integrated.py and test_path_based.py, respectively.
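The word_dropout_rate argument above controls a regularization step in which input words are randomly replaced by an unknown symbol during training, so the model cannot simply memorize specific lemmas. A minimal sketch of this idea (the function name and unknown token are illustrative, not taken from the repository):

```python
import random

def word_dropout(tokens, rate, unk="<unk>", seed=None):
    # Replace each token with the unknown symbol with probability `rate`.
    # Applied only at training time; the seed is just for reproducibility.
    rng = random.Random(seed)
    return [unk if rng.random() < rate else t for t in tokens]

print(word_dropout(["X", "is", "a", "type", "of", "Y"], rate=0.5, seed=3))
```

A rate of 0 leaves the input unchanged, while higher rates push the model to rely more on the path structure than on individual words.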