This repository includes our code used in the following paper (arXiv version), accepted at ICML 2022:
```
@misc{https://doi.org/10.48550/arxiv.2206.10140,
  doi = {10.48550/ARXIV.2206.10140},
  url = {https://arxiv.org/abs/2206.10140},
  author = {Kamigaito, Hidetaka and Hayashi, Katsuhiko},
  keywords = {Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Computation and Language (cs.CL), Social and Information Networks (cs.SI), FOS: Computer and information sciences},
  title = {Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
Note that our original paper at PMLR mistakenly drops |D| in Eqs. (10), (12), and (13) due to typos (see the erratum). Please refer to the latest arXiv version of our paper for the corrected equations.
We implemented our code by modifying KGE-HAKE (Zhang et al., 2020) and KnowledgeGraphEmbedding (Sun et al., 2019). The modified KnowledgeGraphEmbedding is located in `./KnowledgeGraphEmbedding` and the modified KGE-HAKE in `./KGE-HAKE`.
## Requirements

The required packages are listed in `requirements.txt`; you can install them by running:

```
pip install -r requirements.txt
```
## Reproducing the results

### RESCAL, ComplEx, DistMult, TransE, and RotatE

Move to `./KnowledgeGraphEmbedding` and run the training scripts in `./KnowledgeGraphEmbedding/settings/` for each model. You can then evaluate the trained models with `./KnowledgeGraphEmbedding/eval.sh`; the results appear in `test.log` of each model directory.

### HAKE

Move to `./KGE-HAKE` and run the training scripts in `./KGE-HAKE/settings/` for each model. You can then evaluate the trained models with `./KGE-HAKE/eval.sh`; the results appear in `test.log` of each model directory.
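A minimal sketch of this workflow for the `./KnowledgeGraphEmbedding` side (the script layout under `settings/` and the variable names below are placeholders, so adjust them to the actual repository contents; the `./KGE-HAKE` side is analogous):

```bash
# Hypothetical walk-through; the script path under settings/ is a placeholder.
cd ./KnowledgeGraphEmbedding
ls settings/                       # see which per-model training scripts exist
bash settings/$TRAIN_SCRIPT        # train one model/dataset configuration
bash eval.sh                       # evaluate the trained models
cat $MODEL_DIRECTORY/test.log      # inspect the results for one model
```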
## Training

### ./KnowledgeGraphEmbedding
You can run the following scripts:
- `run.sh` trains a model using the self-adversarial negative sampling (SANS) loss function.
- `run_wo_adv.sh` trains a model using the NS loss in Eq. (3) of our paper with uniform noise.
- `run_wo_adv_sum.sh` trains a model using the NS loss in Eq. (2) of our paper with uniform noise.

The above scripts run testing after the final training epoch, so the reported result is for the model from the last epoch. If you need to evaluate the model that achieved the best validation MRR, please use the method described in Testing Models. An example invocation is sketched below.
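For orientation, the upstream KnowledgeGraphEmbedding repository calls its `run.sh` with positional hyperparameters as in the sketch below; the modified scripts in this repository may expect additional options, so check the commands under `./KnowledgeGraphEmbedding/settings/` for the settings actually used in the paper.

```bash
# Sketch based on the upstream KnowledgeGraphEmbedding README:
# run.sh <mode> <model> <dataset> <gpu_id> <save_id> <batch_size> <neg_size>
#        <hidden_dim> <gamma> <alpha> <lr> <max_steps> <test_batch_size> [flags]
# The modified run.sh in this repository may differ.
bash run.sh train RotatE FB15k-237 0 0 1024 256 1000 9.0 1.0 0.00005 100000 16 -de
```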
### ./KGE-HAKE
You can run the following scripts:
- `runs.sh` trains a model using the self-adversarial negative sampling (SANS) loss function.
- `runs_wo_adv.sh` trains a model using the NS loss in Eq. (3) of our paper with uniform noise.
- `runs_wo_adv_sum.sh` trains a model using the NS loss in Eq. (2) of our paper with uniform noise.

The above scripts run testing after the final training epoch, so the reported result is for the model from the last epoch. If you need to evaluate the model that achieved the best validation MRR, please use the method described in Testing Models. An example invocation is sketched below.
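Similarly, the upstream KGE-HAKE repository calls `runs.sh` with positional hyperparameters as sketched below; the modified scripts here may take extra options, so check the commands under `./KGE-HAKE/settings/`.

```bash
# Sketch based on the upstream KGE-HAKE README:
# runs.sh <mode> <model> <dataset> <gpu_id> <save_id> <batch_size> <neg_size>
#         <hidden_dim> <gamma> <alpha> <lr> <max_steps> <test_batch_size>
#         <modulus_weight> <phase_weight>
# The modified runs.sh in this repository may differ.
bash runs.sh train HAKE wn18rr 0 0 512 1024 500 6.0 0.5 0.00005 80000 8 0.5 0.5
```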
## Subsampling

In the training scripts of both `./KnowledgeGraphEmbedding` and `./KGE-HAKE`, you can enable the subsampling methods described in our paper with the following options (a usage sketch follows the list):

- `--default_subsampling`: the subsampling already included in `./KnowledgeGraphEmbedding`.
- `--freq_based_subsampling`: frequency-based subsampling described in Eq. (12) of our paper.
- `--uniq_based_subsampling`: unique-based subsampling described in Eq. (13) of our paper.
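A hedged usage sketch, assuming the training script forwards extra arguments to the Python entry point (check the script you use to confirm; otherwise append the flag to the `python` command the script builds):

```bash
# Hypothetical usage; assumes run.sh passes additional options through to
# codes/run.py. Replace the hyperparameters with your own setting.
bash run.sh train RotatE FB15k-237 0 0 1024 256 1000 9.0 1.0 0.00005 100000 16 -de \
    --freq_based_subsampling
```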
## Testing Models

### ./KnowledgeGraphEmbedding

You can test a trained model in `${MODEL_DIRECTORY}` by using the following command:

```
python -u codes/run.py --do_test --cuda -init ${MODEL_DIRECTORY}
```
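For example, assuming a model was saved under `models/` during training (the directory name below is only illustrative):

```bash
# Hypothetical path; point -init at the directory of your trained model.
MODEL_DIRECTORY=models/RotatE_FB15k-237_0
python -u codes/run.py --do_test --cuda -init ${MODEL_DIRECTORY}
```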
### ./KGE-HAKE

You can test a trained model in `${MODEL_DIRECTORY}` by using the following command:

```
python -u codes/runs.py --do_test --cuda -init ${MODEL_DIRECTORY}
```
## Other Options

Other options are described in `./KGE-HAKE/README.md` and `./KnowledgeGraphEmbedding/README.md`.