Source code and datasets for the IJCAI-2018 paper "Bootstrapping Entity Alignment with Knowledge Graph Embedding".
We use two datasets, DBP15K and DWY100K. DBP15K can be found here, while DWY100K is described below.
Folder "dataset/DWY100K/" contains the id files of DWY100K.
The subfolder "mapping/0_3" contains the id files used in BootEA and MTransE, while the subfolder "sharing/0_3" is for JAPE and IPTransE. The two datasets use 30% of the reference entity alignment as seeds. The id files in "sharing/0_3" follow the idea of parameter sharing, in which the two aligned entities of each seed pair share the same id; the id files in "mapping/0_3" do not.
The subfolder "mapping/0_3" includes the following files:
The subfolder "sharing/0_3" includes the following additional files:
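The difference between the two id schemes can be illustrated with a minimal sketch. It is not the script that produced the released id files: the function `assign_ids`, the entity names, and the assumption that seed pairs are one-to-one are all hypothetical and serve only to show the idea.

```python
def assign_ids(kg1_entities, kg2_entities, seed_pairs, sharing):
    """Assign integer ids to the entities of two knowledge graphs.

    Hypothetical helper for illustration; assumes one-to-one seed pairs.
    With sharing=True, both entities of a seed pair receive the same id.
    With sharing=False, every entity receives its own id.
    """
    ids1, ids2 = {}, {}
    next_id = 0
    if sharing:
        # Each seed pair gets a single id that is used on both sides.
        for e1, e2 in seed_pairs:
            ids1[e1] = next_id
            ids2[e2] = next_id
            next_id += 1
    # Every entity that has no id yet gets a fresh one.
    for entities, ids in ((kg1_entities, ids1), (kg2_entities, ids2)):
        for ent in entities:
            if ent not in ids:
                ids[ent] = next_id
                next_id += 1
    return ids1, ids2


# Made-up entity names, not taken from DWY100K.
kg1 = ["kg1:Paris", "kg1:Berlin"]
kg2 = ["kg2:Paris", "kg2:Berlin"]
seeds = [("kg1:Paris", "kg2:Paris")]

shared1, shared2 = assign_ids(kg1, kg2, seeds, sharing=True)
mapped1, mapped2 = assign_ids(kg1, kg2, seeds, sharing=False)
```

In the sharing case the seed pair ends up with one common id, so a model that looks up embeddings by id uses a single vector for both entities. In the mapping case the ids stay distinct and the seed alignment has to be supplied to the model separately.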
File "dataset/DWY100K_raw_data.zip" is the raw data of DWY100K, where each entity, relation or attribute is represented by a URI. Each dataset has the following files:
Folder "code" contains all the code of BootEA, in which:
If you fail to install Graph-tool, we suggest setting "self.heuristic = False" in param.py, which allows BootEA to run using igraph rather than Graph-tool. If you have trouble installing igraph, you can use NetworkX instead by modifying lines 186-189 of train_bp.py and replacing "mwgm_graph_tool" and "mwgm_igraph" with "mwgm_networkx". Note that igraph and NetworkX are much slower than Graph-tool!
If you have any difficulty or questions about running the code or reproducing the experimental results, please email zqsun.nju@gmail.com and whu@nju.edu.cn.
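To find out which of the three libraries is importable before editing param.py or train_bp.py, a small check like the one below may help. This is a sketch and not part of the BootEA code: the helper names `is_installed` and `pick_matching_function` are hypothetical, and only the function names "mwgm_graph_tool", "mwgm_igraph" and "mwgm_networkx" come from the text above.

```python
import importlib.util

# Preference order follows the note above: Graph-tool first, since igraph and
# NetworkX are described as much slower.
# Each entry is (Python import name, matching function named in this README).
BACKENDS = [
    ("graph_tool", "mwgm_graph_tool"),
    ("igraph", "mwgm_igraph"),
    ("networkx", "mwgm_networkx"),
]


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_matching_function(installed=is_installed):
    """Return the name of the first matching function whose library is available.

    Hypothetical helper; returns None if none of the libraries is installed.
    """
    for module_name, function_name in BACKENDS:
        if installed(module_name):
            return function_name
    return None


if __name__ == "__main__":
    print("Usable matching function:", pick_matching_function())
```

The `installed` parameter exists only so that the selection logic can be exercised without the libraries being present, for example `pick_matching_function(lambda name: name == "networkx")`.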
If you use this model or code, please cite it as follows:
@inproceedings{BootEA,
author = {Zequn Sun and Wei Hu and Qingheng Zhang and Yuzhong Qu},
title = {Bootstrapping Entity Alignment with Knowledge Graph Embedding},
booktitle = {IJCAI},
pages = {4396--4402},
year = {2018}
}
The following links point to some recent work that uses our datasets: