Know2BIO is a comprehensive biomedical knowledge graph harmonizing heterogeneous database sources.
We recommend using Anaconda3 to manage the environment.
env.yaml
: set $USER_PATH to user's directory.know2bio
environment using conda env create -f env.yml
.Environment Setup
Section.main.py
script. Arguments are listed below.
usage: main.py [-h] [--dataset {ontology,instance,whole,FB15K,WN,WN18RR,FB237,YAGO3-10}]
[--model {TransE,TransR,DistMult,CP,MurE,RotE,RefE,AttE,RotH,RefH,AttH,ComplEx,RotatE}] [--regularizer {N3,F2}] [--reg REG]
[--optimizer {Adagrad,Adam,SparseAdam}] [--max_epochs MAX_EPOCHS] [--patience PATIENCE] [--valid VALID] [--rank RANK] [--batch_size BATCH_SIZE] [--neg_sample_size NEG_SAMPLE_SIZE]
[--init_size INIT_SIZE] [--learning_rate LEARNING_RATE]
Knowledge Graph Embedding
options: -h, --help show this help message and exit --dataset {ontology,instance,whole} Knowledge Graph dataset: ontology, instance, whole views --model {TransE,TransR,DistMult,CP,MurE,RotE,RefE,AttE,RotH,RefH,AttH,ComplEx,RotatE} Knowledge Graph embedding model --optimizer {Adagrad,Adam,SparseAdam} Optimizer --max_epochs MAX_EPOCHS Maximum number of epochs to train for --patience PATIENCE Number of epochs before early stopping --valid VALID Number of epochs before validation --rank RANK Embedding dimension --batch_size BATCH_SIZE Batch size --neg_sample_size NEG_SAMPLE_SIZE Negative sample size, -1 to not use negative sampling --dropout DROPOUT Dropout rate --init_size INIT_SIZE Initial embeddings' scale --learning_rate LEARNING_RATE Learning rate
- Example: Train TransE model on Know2BIO's whole view
```bash
CUDA_VISIBLE_DEVICES=0 python main.py --model TransE --dataset whole --valid 10 --patience 5 --rank 512 --neg_sample_size 150 --optimizer Adam --learning_rate 0.001
Code and README for the benchmarking Know2BIO can be found in benchmark.
Code and README for the construction of Know2BIO can be found in dataset.