kamigaito / ICML2022


Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

This repository contains the code used in the following paper (arXiv), which was accepted at ICML 2022:

@misc{https://doi.org/10.48550/arxiv.2206.10140,
  doi = {10.48550/ARXIV.2206.10140},
  url = {https://arxiv.org/abs/2206.10140},
  author = {Kamigaito, Hidetaka and Hayashi, Katsuhiko},
  keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Computation and Language (cs.CL), Social and Information Networks (cs.SI), FOS: Computer and information sciences},
  title = {Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Note that our original paper at PMLR mistakenly drops |D| from Eqs. (10), (12), and (13) due to typos (see the erratum). Please refer to the latest arXiv version of our paper to understand our work.

We implemented our code by modifying KGE-HAKE (Zhang et al., 2020) and KnowledgeGraphEmbedding (Sun et al., 2019).

Directories

Our modified KnowledgeGraphEmbedding is located in ./KnowledgeGraphEmbedding, and our modified KGE-HAKE is in ./KGE-HAKE.

Requirements

You can install the requirements by running:

pip install -r requirements.txt
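
If you prefer an isolated environment, a standard setup is (a minimal sketch; any recent Python 3 should work, but check the pinned versions in requirements.txt):

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt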

Usage

Reproducing Results

To rerun RESCAL, ComplEx, DistMult, TransE, and RotatE:

  1. Move to ./KnowledgeGraphEmbedding.
  2. Run the setting files in ./KnowledgeGraphEmbedding/settings/ for each model.
  3. After training, you can test the trained models by running ./KnowledgeGraphEmbedding/eval.sh (see the sketch after this list).
    • The evaluation results are stored in test.log in each model directory.
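
Concretely, a full reproduction run looks like the following sketch (assuming the setting files are shell scripts; <setting_file> and <model_directory> are placeholders for the actual names):

cd KnowledgeGraphEmbedding
bash settings/<setting_file>    # train one model
bash eval.sh                    # evaluate all trained models
cat <model_directory>/test.log  # per-model evaluation results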

To rerun HAKE:

  1. Move to ./KGE-HAKE.
  2. Run the setting files in ./KGE-HAKE/settings/ for each model.
  3. After training, you can test the trained models by running ./KGE-HAKE/eval.sh (see the sketch after this list).
    • The evaluation results are stored in test.log in each model directory.
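
The HAKE workflow has the same shape (again a sketch with placeholder names):

cd KGE-HAKE
bash settings/<setting_file>
bash eval.sh
cat <model_directory>/test.log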

Training Models

RESCAL, ComplEx, DistMult, TransE, and RotatE in ./KnowledgeGraphEmbedding

You can run the following scripts:

The above scripts run testing after the final training epoch. Note that this evaluates the model obtained at the last training epoch; if you need to evaluate the model that achieved the best validation MRR, use the method described in Testing Models.
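
For reference, the training commands in these scripts follow the upstream KnowledgeGraphEmbedding interface. A sketch with illustrative dataset, model, and hyperparameter values (the committed setting files are the authoritative source for the values used in the paper):

python -u codes/run.py --do_train --do_valid --do_test --cuda \
    --data_path data/FB15k-237 --model RotatE \
    -n 256 -b 1024 -d 1000 -g 9.0 -a 1.0 -adv -lr 0.00005 \
    --max_steps 100000 -save models/RotatE_FB15k-237 --test_batch_size 16 -de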

HAKE in ./KGE-HAKE

You can run the following scripts:

As above, these scripts run testing after the final training epoch; to evaluate the model that achieved the best validation MRR, use the method described in Testing Models.
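
The HAKE entry point is codes/runs.py, whose interface largely mirrors the above. A sketch with illustrative values (an assumption; HAKE additionally takes modulus/phase weight options, whose exact flag names appear in the setting files):

python -u codes/runs.py --do_train --do_valid --do_test --cuda \
    --data_path data/wn18rr --model HAKE \
    -n 1024 -b 512 -d 500 -g 6.0 -a 1.0 -adv -lr 0.00005 \
    --max_steps 80000 -save models/HAKE_wn18rr --test_batch_size 8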

Subsampling

In the training scripts of both ./KnowledgeGraphEmbedding and ./KGE-HAKE, you can enable the subsampling methods described in our paper through the following options:
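
To make the idea concrete, here is a minimal Python sketch of frequency-based subsampling weights, assuming the count approximation #(h,r,t) ≈ #(h,r) + #(r,t) used in the Sun et al. (2019) codebase (the function and variable names are ours, not the repository's):

from collections import Counter

def subsampling_weights(triples):
    # Most triples occur only once in a KG, so approximate the
    # frequency of (h, r, t) by #(h, r) + #(r, t).
    hr = Counter((h, r) for h, r, t in triples)
    rt = Counter((r, t) for h, r, t in triples)
    # Word2vec-style discounting: weight proportional to 1 / sqrt(frequency).
    return [(hr[(h, r)] + rt[(r, t)]) ** -0.5 for h, r, t in triples]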

Testing Models

RESCAL, ComplEx, DistMult, TransE, and RotatE in ./KnowledgeGraphEmbedding

You can test a trained model in ${MODEL_DIRECTORY} by using the following command:

python -u codes/run.py --do_test --cuda -init ${MODEL_DIRECTORY}
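
For example, with a hypothetical model directory models/RotatE_FB15k-237:

python -u codes/run.py --do_test --cuda -init models/RotatE_FB15k-237
cat models/RotatE_FB15k-237/test.log   # evaluation results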

HAKE in ./KGE-HAKE

You can test a trained model in ${MODEL_DIRECTORY} by using the following command:

python -u codes/runs.py --do_test --cuda -init ${MODEL_DIRECTORY}
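
Note the different entry-point name here (runs.py rather than run.py). With a hypothetical model directory:

python -u codes/runs.py --do_test --cuda -init models/HAKE_wn18rr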

Other Details

Other options are described in ./KGE-HAKE/README.md and ./KnowledgeGraphEmbedding/README.md.