This is the implementation of the paper GraphLLM: Boosting Graph Reasoning Ability of Large Language Model.
Set up the environment:
conda create -n graph_llm python=3.10 -y
pip install -r requirements.txt
Link the LLaMA-2-7B checkpoint and tokenizer into the repository root:
ln -s /folder/of/LLaMA-2-7B/checkpoint ./LLaMA-7B-2
ln -s /folder/of/LLaMA-2-7B/tokenizer ./Llama-2-7b-hf
Replace /folder/of/LLaMA-2-7B/checkpoint and /folder/of/LLaMA-2-7B/tokenizer with the actual directories!
Unzip the dataset:
unzip dataset.zip -d ./dataset
The directory structure should be:
.
|- LLaMA-7B-2
| |- params.json
| |- consolidated.00.pth
|
|- Llama-2-7b-hf
| |- tokenizer.model
|
|- dataset
| |- sc
| |- mts
| |- sp
| |- bgm
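Optionally, you can sanity-check that the links and the unzipped dataset resolve to the expected locations (this check is not part of the original instructions; the trailing slashes make ls follow the symlinks):
ls LLaMA-7B-2/ Llama-2-7b-hf/ dataset/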
Train and evaluate the model with default settings on each graph reasoning dataset, using GPU 0:
./scripts/sc.sh
./scripts/mts.sh
./scripts/sp.sh
./scripts/bgm.sh
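To use a different GPU, one option is to restrict the visible devices when invoking a script. This is only a sketch that assumes the scripts respect the standard CUDA_VISIBLE_DEVICES variable; if a script hard-codes the device index, edit it inside the script instead:
CUDA_VISIBLE_DEVICES=1 ./scripts/sc.sh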
More hyperparameter settings can be found in config.py.
Hyperparameter explanation:
--n_encoder_layers: number of transformer layers in the textual encoder
--n_decoder_layers: number of transformer layers in the textual decoder
--n_mp_layers: number of graph transformer layers
--adapter_dim: hidden dimension of the textual encoder/decoder and the graph transformer
--adapter_len: number of prefix tokens per LLM layer
--rrwp: dimension of the graph positional encoding
--batch_size: batch size held in memory during training
--grad_steps: gradient accumulation steps; grad_steps × batch_size is the effective batch size for optimization
--lr: learning rate
--num_epochs: number of training epochs
--warmup_epochs: number of linear warmup epochs
--wd: weight decay
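For example, --batch_size 4 with --grad_steps 8 gives an effective optimization batch size of 32. The invocation below is only a sketch of how these flags might be overridden: the entrypoint name (main.py) and the example values are assumptions, so check the scripts under ./scripts/ and config.py for the actual interface before adapting it:
python main.py --batch_size 4 --grad_steps 8 --lr 1e-4 --num_epochs 10 --warmup_epochs 1 --wd 0.05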