xwhan / DeepPath

code and docs for my EMNLP paper "DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning"
533 stars 135 forks source link
emnlp2017 knowledge-graph reasoning tensorflow

Deep Reinforcement Learning for Knowledge Graph Reasoning

We study the problem of learning to reason in large scale knowledge graphs (KGs). More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector-space by sampling the most promising relation to extend its path. In contrast to prior work, our approach includes a reward function that takes the accuravy, diversity, and efficiency into consideration. Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge graph embedding methods on Freebase and Never-Ending Language Learning datasets.

Access the dataset

Download the knowledge graph dataset NELL-995 FB15k-237

How to run our code

  1. unzip the data, put the data folder in the code directory
  2. run the following scripts within scripts/

    • ./pathfinder.sh ${relation_name} # find the reasoning paths, this is RL training, it might take sometime
    • ./fact_prediction_eval.py ${relation_name} # calculate & print the fact prediction results
    • ./link_prediction_eval.sh ${relation_name} # calculate & print the link prediction results

    Examples (the relation_name can be found in NELL-995/tasks/):

    • ./pathfinder.sh concept_athletehomestadium
    • ./fact_prediction_eval.py concept_athletehomestadium
    • ./link_prediction_eval.sh concept_athletehomestadium
  3. Since we already put the reasoning paths in the dataset, you can directly run fact_prediction_eval.py or link_prediction_eval.sh to get the final results for each reasoning task

Format of the dataset

  1. raw.kb: the raw kb data from NELL system
  2. kb_env_rl.txt: we add inverse triples of all triples in raw.kb, this file is used as the KG for reasoning
  3. entity2vec.bern/relation2vec.bern: transE embeddings to represent out RL states, can be trained using TransX implementations by thunlp
  4. tasks/: each task is a particular reasoning relation
    • tasks/${relation}/*.vec: trained TransH Embeddings
    • tasks/${relation}/*.vec_D: trained TransD Embeddings
    • tasks/${relation}/*.bern: trained TransR Embedding trained
    • tasks/${relation}/*.unif: trained TransE Embeddings
    • tasks/${relation}/transX: triples used to train the KB embeddings
    • tasks/${relation}/train.pairs: train triples in the PRA format
    • tasks/${relation}/test.pairs: test triples in the PRA format
    • tasks/${relation}/path_to_use.txt: reasoning paths found the RL agent
    • tasks/${relation}/path_stats.txt: path frequency of randomised BFS

If you use our code, please cite the paper

@InProceedings{wenhan_emnlp2017,
  author    = {Xiong, Wenhan and Hoang, Thien and Wang, William Yang},
  title     = {DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning},
  booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)},
  month     = {September},
  year      = {2017},
  address   = {Copenhagen, Denmark},
  publisher = {ACL}
}

Acknowledgement