Data for the shared task is available at https://github.com/SUDA-HLT/IPRE, and the review paper is available at https://arxiv.org/abs/1908.11337.
We provide a baseline system based on a convolutional neural network with selective attention.
Please download the data from the competition website, then unzip the files and put them in the ./data/ folder.
You can use the following command to train models for Sent-Track or Bag-Track:
python baseline.py --level sent
python baseline.py --level bag
The model will be stored in the ./model/ folder.
We provide a large-scale unlabeled corpus for training word vectors or language models. The word vectors used in the baseline system are trained with the Python package gensim, with parameters set as follows:
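from gensim.models import word2vec
# `sentences` is an iterable of tokenized sentences (each a list of tokens).
# In gensim >= 4.0 the `size` argument is named `vector_size`.
model = word2vec.Word2Vec(sentences, sg=1, size=300, window=5, min_count=10, negative=5, sample=1e-4, workers=10)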
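The snippet above does not show where `sentences` comes from. The following is a minimal sketch of one way to build it, assuming the unlabeled corpus is a UTF-8 text file with one sentence per line and tokens already separated by whitespace. The names `CorpusSentences` and `train_word_vectors` and the file layout are illustrative assumptions, not part of the baseline code.

```python
class CorpusSentences:
    """Restartable iterable that yields one tokenized sentence per line.

    Assumption: the corpus is pre-segmented, so splitting on whitespace
    is enough to recover the tokens.
    """

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                tokens = line.split()
                if tokens:  # skip blank lines
                    yield tokens


def train_word_vectors(corpus_path):
    """Train skip-gram vectors with the parameters listed in this README."""
    # Imported here so the iterator above can be used without gensim installed.
    import gensim
    from gensim.models import word2vec

    params = dict(sg=1, window=5, min_count=10, negative=5,
                  sample=1e-4, workers=10)
    # gensim 4.x uses `vector_size`; earlier releases use `size`.
    if int(gensim.__version__.split(".")[0]) >= 4:
        params["vector_size"] = 300
    else:
        params["size"] = 300
    return word2vec.Word2Vec(CorpusSentences(corpus_path), **params)
```

A class is used rather than a generator because Word2Vec iterates over the corpus more than once (vocabulary pass, then training epochs), so the iterable has to be restartable. A call would look like `train_word_vectors("corpus.txt")`, where the path is a placeholder.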
You can use the following command to test models for Sent-Track or Bag-Track:
python baseline.py --mode test --level sent
python baseline.py --mode test --level bag
Predicted results will be stored in result_sent.txt or result_bag.txt.
We use the F1 score as the basic evaluation metric for system performance. Using pre-trained word vectors, our baseline system achieves an F1 score of about 0.22 in Sent-Track and about 0.31 in Bag-Track.
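For a quick local check of predictions, the F1 score can be sketched as below. This is an illustration under stated assumptions, not the official scorer: it assumes one label per instance, a micro-averaged F1 that ignores the "no relation" class, and that this class has label id 0 (`na_label` is a hypothetical parameter). The official evaluation may differ, for example if instances carry multiple labels.

```python
def micro_f1(gold, pred, na_label=0):
    """Micro-averaged F1 over all labels except `na_label`.

    gold, pred: equal-length sequences of label ids, one per instance.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # A prediction counts as correct only if it matches a non-NA gold label.
    correct = sum(1 for g, p in zip(gold, pred) if g == p and g != na_label)
    n_pred = sum(1 for p in pred if p != na_label)
    n_gold = sum(1 for g in gold if g != na_label)

    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 2 of 3 predicted relations are correct, 2 of 3 gold relations are found.
score = micro_f1([1, 0, 2, 3], [1, 2, 0, 3])
```

In the toy example precision and recall are both 2/3, so the F1 score is also 2/3.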