This repository provides the source code and data for our paper, Ruleformer: Context-aware Rule Mining over Knowledge Graph (COLING 2022).
The dataset directory should contain:
train.txt, valid.txt, test.txt: one triple per line, in the format h r t
entities.txt: lists all the entities that appear in the KG
relations.txt: lists all the relations that appear in the KG
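Before preprocessing, it can help to check that a dataset directory matches this layout. The following is a minimal sketch, not part of this repository: the helper names (load_triples, load_vocab, check_dataset) are hypothetical, and it assumes that h, r and t are separated by whitespace and that entities.txt and relations.txt hold one name per line.

```python
import os


def load_triples(path):
    """Read one 'h r t' triple per line, split on whitespace (an assumption)."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            if len(parts) != 3:
                raise ValueError(f"{path}:{line_no}: expected 'h r t', got {line!r}")
            triples.append(tuple(parts))
    return triples


def load_vocab(path):
    """Read one name per line (assumed layout of entities.txt / relations.txt)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def check_dataset(data_dir):
    """Return the triples that use an entity or relation missing from the vocab files."""
    entities = load_vocab(os.path.join(data_dir, "entities.txt"))
    relations = load_vocab(os.path.join(data_dir, "relations.txt"))
    unknown = []
    for split in ("train.txt", "valid.txt", "test.txt"):
        for h, r, t in load_triples(os.path.join(data_dir, split)):
            if h not in entities or t not in entities or r not in relations:
                unknown.append((split, h, r, t))
    return unknown
```

An empty list from check_dataset("path/to/dataset") means every triple only uses listed entities and relations.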
Run the following command to generate subgraphs for your dataset:
python transformer/dataset.py -data=DATA -maxN=MAXN -padding=PADDING -jump=JUMP
-data: string, the relative or absolute path of the dataset
-maxN: integer, filters nodes whose degree exceeds this value
-padding: integer, the maximum sequence length; longer sequences are cut off
-jump: integer, the length of the rule
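To make the roles of -maxN and -padding more concrete, here is a rough sketch of a degree filter and a length cut-off. It is not the code in transformer/dataset.py. The function names are hypothetical, and several details are assumptions: degree is counted as incoming plus outgoing edges, filtering is done by dropping triples that touch a high-degree node, and short sequences are padded with a placeholder token up to the same length.

```python
from collections import Counter


def node_degrees(triples):
    """Count how many triples each entity appears in (as head or tail)."""
    degree = Counter()
    for h, _r, t in triples:
        degree[h] += 1
        degree[t] += 1
    return degree


def drop_high_degree(triples, max_n):
    """Keep only triples whose head and tail both have degree <= max_n (assumed rule)."""
    degree = node_degrees(triples)
    return [(h, r, t) for h, r, t in triples
            if degree[h] <= max_n and degree[t] <= max_n]


def fit_length(seq, padding, pad_token="<pad>"):
    """Cut a sequence off at `padding` items, then pad shorter ones (padding is assumed)."""
    seq = list(seq)[:padding]
    return seq + [pad_token] * (padding - len(seq))
```

For example, drop_high_degree(triples, 40) and fit_length(sequence, 140) mirror the values used in the umls commands in this README.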
Run the following command to train the model:
python translate.py -data=DATASET/DATA -jump=JUMP -padding=PADDING -batch_size=BATCH_SIZE -desc=DESC
-d_v, -n_head, -n_layers: integer, hyperparameters of the transformer
-subgraph: [OPT] integer, selects another subgraph while training JUMP-hop rules
Run the following command to decode rules from a trained checkpoint:
python translate.py -data=DATASET/DATA -jump=JUMP -padding=PADDING -batch_size=BATCH_SIZE -desc=DESC -ckpt=CKPT -decode_rule
-ckpt: string, the checkpoint from which to decode rules
-the_rel: float, relative threshold of the next relation
-the_rel_min: float, absolute threshold of the next relation
-the_all: float, absolute threshold of the whole rule
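The three thresholds can be read as successive filters during rule decoding. The sketch below shows one plausible interpretation and is not taken from translate.py. It assumes that the relative threshold is measured against the most probable next relation, that the absolute threshold applies to each next-relation probability directly, and that a rule's score is the product of its step probabilities. The function names are hypothetical.

```python
def keep_next_relations(probs, the_rel, the_rel_min):
    """Filter candidate next relations.

    probs: dict mapping relation name -> probability of being the next relation.
    A candidate is kept if it is at least `the_rel` times the best probability
    (relative threshold, assumed) and at least `the_rel_min` (absolute threshold).
    """
    if not probs:
        return {}
    best = max(probs.values())
    return {r: p for r, p in probs.items()
            if p >= the_rel * best and p >= the_rel_min}


def keep_rule(step_probs, the_all):
    """Keep a rule if the product of its step probabilities reaches `the_all` (assumed score)."""
    score = 1.0
    for p in step_probs:
        score *= p
    return score >= the_all
```

Under this reading, raising -the_rel or -the_rel_min prunes candidate relations at each hop, while raising -the_all discards low-confidence rules as a whole.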
# preprocess
python transformer/dataset.py -data=umls -maxN=40 -padding=140 -jump=3
# train
python translate.py -data=DATASET/umls -jump=3 -padding=140 -batch_size=5 -epoch=50 -n_head=6 -d_v=64 -desc=umls -savestep=5
# decode rule
python translate.py -data=DATASET/umls -jump=3 -padding=140 -batch_size=5 -epoch=50 -n_head=6 -d_v=64 -desc=umls-rule -ckpt=EXPS/umls-j3-mul-XX/TranslatorXX.ckpt -decode_rule
If you find this code useful, please cite the following paper.
@inproceedings{xu-etal-2022-ruleformer,
title = "Ruleformer: Context-aware Rule Mining over Knowledge Graph",
author = "Xu, Zezhong and
Ye, Peng and
Chen, Hui and
Zhao, Meng and
Chen, Huajun and
Zhang, Wen",
booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
month = oct,
year = "2022",
address = "Gyeongju, Republic of Korea",
publisher = "International Committee on Computational Linguistics",
url = "https://aclanthology.org/2022.coling-1.225",
pages = "2551--2560",
}
We referred to the code of Transformers. Thanks to its authors for their contributions.