This repository contains the files for our Event Nugget Detection system, which was submitted to the TAC 2015 shared task on Event Nugget Detection. It is described in the paper "Event Nugget Detection, Classification and Coreference Resolution using Deep Neural Networks and Gradient Boosted Decision Trees".
We implemented a feed-forward network following the approach of Collobert et al., 'NLP (almost) from scratch' and trained it on the provided data.
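In the window approach of Collobert et al., each token is classified from a fixed-size window of its neighbouring tokens, with padding added at the sentence boundaries. The following is a minimal sketch of that window construction only, not the code in this repository; the function name, the window size, and the `PADDING` token string are assumptions made for illustration.

```python
PADDING = "PADDING"  # hypothetical padding token, chosen for this sketch


def context_windows(tokens, size=2, pad=PADDING):
    """Return one window of 2*size+1 tokens centred on each input token."""
    # Pad both ends so that the first and last tokens also get full windows.
    padded = [pad] * size + list(tokens) + [pad] * size
    return [padded[i:i + 2 * size + 1] for i in range(len(tokens))]


if __name__ == "__main__":
    for window in context_windows(["Troops", "attacked", "the", "city"], size=1):
        print(window)
```

In such a setup, each window would be mapped to word embeddings and fed to the network, which predicts a label for the centre token.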
If you find this work useful, please cite the following paper:
@inproceedings{ reimers-gurevych:2015:TAC,
author = {Nils Reimers and Iryna Gurevych},
title = {Event Nugget Detection, Classification and Coreference Resolution using
Deep Neural Networks and Gradient Boosted Decision Trees},
month = {November},
year = {2015},
booktitle = {Proceedings of the Eighth Text Analysis Conference (TAC 2015) (to appear)},
editor = {National Institute of Standards and Technology (NIST)},
location = {Gaithersburg, Maryland, USA},
research_area = {Ubiquitous Knowledge Processing},
url = {},
}
Abstract: For the shared task of event nugget detection at TAC 2015 we trained a deep feed forward network achieving an official F1-score of 65.31% for plain annotations, 55.56% for event mention type and 49.16% for the realis value. For the task of Event Coreference Resolution we prototyped a simple baseline using Gradient Boosted Decision Trees achieving an overall average CoNLL score of 70.02%.
Contact Person: Nils Reimers
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
Make sure vocab/deps.words exists, then append padding_unknown_300d.txt to it by running cat padding_unknown_300d.txt >> deps.words in the vocab folder.
To train your own models, execute the training script. Given the config/config.txt file, this script trains a new model based on the train, development, and test files in the tacdata folder.
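As an illustration of how an embeddings file such as vocab/deps.words might be read, here is a small sketch. It assumes a plain-text format with one entry per line, a word followed by space-separated vector components, and it assumes the appended file contributes an entry for unknown words named `UNKNOWN`. Both the format and the token name are assumptions, and the helper functions are hypothetical, not part of this repository.

```python
def load_embeddings(lines):
    """Parse lines of the form 'word v1 v2 ... vn' into a word -> vector dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def lookup(vectors, token, unknown="UNKNOWN"):
    """Return the vector for token, falling back to the assumed unknown entry."""
    return vectors.get(token, vectors[unknown])


if __name__ == "__main__":
    # In-memory stand-in for the file; real vectors would be much longer.
    sample = ["UNKNOWN 0.0 0.0", "city 0.5 -1.0"]
    table = load_embeddings(sample)
    print(lookup(table, "city"))
    print(lookup(table, "not-in-vocabulary"))
```

To read from disk, the same function could be called with an open file object, for example `load_embeddings(open("vocab/deps.words", encoding="utf-8"))`.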
To execute the pre-trained model, run the provided script. It reads the input.txt file and adds event annotations using a BIO encoding. The output is stored in the output.txt file.
This step requires Stanford CoreNLP. The jars must be saved in the corenlp folder.
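In a BIO encoding, the first token of an annotated span is tagged `B-<type>`, following tokens of the same span `I-<type>`, and all other tokens `O`. The sketch below shows how such a tag sequence can be turned back into spans. It makes no claim about the exact layout of output.txt; the function name and the label names in the example are hypothetical.

```python
def bio_to_spans(tags):
    """Convert BIO tags into (start, end, label) spans; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # An I- tag with the same label extends the currently open span.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the open span, if there is one.
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        # B- opens a new span; a stray I- is leniently treated the same way.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
    if label is not None:
        spans.append((start, len(tags), label))
    return spans


if __name__ == "__main__":
    tokens = ["Troops", "attacked", "and", "took", "over", "the", "city"]
    tags = ["O", "B-Attack", "O", "B-Transfer", "I-Transfer", "O", "O"]
    for start, end, label in bio_to_spans(tags):
        print(label, tokens[start:end])
```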
This code is published under GPL version 3 or later.