This repository contains the source code and dataset for the paper: Revisiting the Negative Data of Distantly Supervised Relation Extraction. Chenhao Xie, Jiaqing Liang, Jingping Liu, Chengsong Huang, Wenhao Huang, Yanghua Xiao. ACL 2021.
Install all the dependencies in requirements.txt.
Download the BERT-related files and follow the instructions in tfhub/*/readme.md.
Run rere/bert-to-h5.py to produce bert_uncased.h5 and chinese_roberta_wwm_ext.h5.
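Before training, it may help to confirm that both converted weight files exist. The following is a minimal sketch and not part of the repository: the helper name missing_weight_files and its weights_dir argument are hypothetical, and the directory that bert-to-h5.py writes to is an assumption you should adjust.

```python
from pathlib import Path

# File names come from the README; the directory they are written to is an assumption.
EXPECTED_WEIGHTS = ("bert_uncased.h5", "chinese_roberta_wwm_ext.h5")


def missing_weight_files(weights_dir="."):
    """Return the expected .h5 files that are not present in weights_dir."""
    base = Path(weights_dir)
    return [name for name in EXPECTED_WEIGHTS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_weight_files(".")
    if missing:
        print("Missing converted weights:", ", ".join(missing))
    else:
        print("All converted weights found.")
```

An empty list means both files were found in the directory you passed.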
The models ReRe and ReRe_LSTM in Table 3 are provided for reproduction in the directories rere and rere_lstm.
extraction.py is the main file.
To train the model, run python extraction.py {data_set_name} train.
To load a trained model and predict, run python extraction.py {data_set_name}, for example python extraction.py NYT11-HRL. We can provide the pre-trained models for reproducing exactly the same results as in the paper.
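If you prefer to launch training and prediction from a script, the two commands above can be wrapped as in the sketch below. The helper names build_command and run_extraction are hypothetical, and the cwd default assumes that extraction.py lives in the rere directory; change it to rere_lstm or another path as needed.

```python
import subprocess
import sys


def build_command(data_set_name, train=False):
    """Build the argument list for extraction.py; 'train' is appended only for training."""
    cmd = [sys.executable, "extraction.py", data_set_name]
    if train:
        cmd.append("train")
    return cmd


def run_extraction(data_set_name, train=False, cwd="rere"):
    """Run extraction.py in cwd (assumed location of the script) and raise on failure."""
    return subprocess.run(build_command(data_set_name, train), cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_extraction(...) to actually execute them.
    print(" ".join(build_command("NYT11-HRL", train=True)))
    print(" ".join(build_command("NYT11-HRL")))
```

Using sys.executable keeps the child process in the same Python environment as the calling script, which matters when the dependencies are installed in a conda environment.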
The data sets in Figure 3 are provided in data/FNexp. They are generated by data/FN_data_gen.py.
To train the corresponding model, use the command python extraction.py FNexp/{data_set_name}@{radio} train, for example python extraction.py FNexp/ske2019@0.1.
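The FNexp argument follows the pattern FNexp/{data_set_name}@{radio}, so it can be assembled programmatically. The sketch below is illustrative only: the function names are hypothetical, and the only value confirmed by this README is ske2019 with 0.1. Which other combinations exist depends on the contents of data/FNexp.

```python
def fnexp_data_set(data_set_name, radio):
    """Format the FNexp data set argument, e.g. FNexp/ske2019@0.1."""
    return f"FNexp/{data_set_name}@{radio}"


def fnexp_training_command(data_set_name, radio):
    """Return the training command for one FNexp data set as an argument list."""
    return ["python", "extraction.py", fnexp_data_set(data_set_name, radio), "train"]


if __name__ == "__main__":
    # 0.1 is the value used in the README example; add others that exist in data/FNexp.
    for radio in [0.1]:
        print(" ".join(fnexp_training_command("ske2019", radio)))
```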
Datasets are provided separately in this repo, including two new datasets: NYT21 and SKE21 (the labeled test set of SKE2019).
The package bert4keras that we provide in ./rere/BERT_TF2 can alternatively be installed via pip, but we do not guarantee that its latest version works with our code. If trouble occurs, please run pip uninstall bert4keras.
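To see which copy of bert4keras Python would actually import (a pip-installed one or a local one), you can inspect the module's origin. This is a generic sketch using the standard library, not code from the repository; the helper name module_origin is hypothetical, and the result depends on the working directory and sys.path of the process that runs it.

```python
import importlib.util


def module_origin(name="bert4keras"):
    """Return the file a top-level module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    origin = module_origin("bert4keras")
    if origin is None:
        print("bert4keras is not importable from this location.")
    else:
        print("bert4keras would be imported from:", origin)
```

A path inside site-packages suggests the pip-installed version is being picked up.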
If the pretrained models are needed for reproduction, please contact the authors. We are willing to provide them.
NVIDIA-SMI 455.23.04
Driver Version: 455.23.04
CUDA Version: 11.1
GeForce RTX 3090
Python 3.7.9
requirements.txt is provided for setting up the virtual environment in conda.
@inproceedings{xie2021revisiting,
title={Revisiting the Negative Data of Distantly Supervised Relation Extraction},
author={Xie, Chenhao and Liang, Jiaqing and Liu, Jingping and Huang, Chengsong and Huang, Wenhao and Xiao, Yanghua},
booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year={2021}
}