Hi @MINJIK01,
Many thanks for your interest in our project.
The Google Drive link provided in the repo (https://drive.google.com/file/d/1fRXdCMHpkb1-kuzcxgZPKkILEWBSbW4M) contains the training, validation, and test sets for each of the 4 datasets. Note that the train/val/test graphs are stored in the same files, and the split is defined in the file {dataset_name}/{dataset_name}_split.pkl. The split is applied automatically in train_graph_llm.py as follows:
# Line 48
dataset, split, edge_index = load_dataset[args.dataset]()
...
# Line 87-90
train_dataset = clm_dataset_train.select(split['train'])
val_dataset = clm_dataset_train.select(split['valid'])
val_dataset_eval = clm_dataset_test.select(split['valid'])
test_dataset = clm_dataset_test.select(split['test'])
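To illustrate the idea outside the training script, here is a minimal, self-contained sketch of index-based splitting. It assumes the split file is a pickled dict of index lists keyed by 'train', 'valid', and 'test' (the keys come from the snippet above; the exact on-disk layout is an assumption). The toy data, the file name toy_split.pkl, and the select helper are hypothetical and are not part of the repo.

```python
import os
import pickle
import tempfile


def select(items, indices):
    """Return the items at the given positions (a stand-in for an index-based select)."""
    return [items[i] for i in indices]


# Hypothetical toy data: six "graphs" stored together in one collection.
graphs = [f"graph_{i}" for i in range(6)]

# Assumed layout of the split file: index lists keyed by split name.
split = {"train": [0, 1, 2], "valid": [3], "test": [4, 5]}

# Round-trip the split through a pickle file, as a *_split.pkl file would be read.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_split.pkl")
    with open(path, "wb") as f:
        pickle.dump(split, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

# All three subsets are drawn from the same underlying collection.
train_set = select(graphs, loaded["train"])
valid_set = select(graphs, loaded["valid"])
test_set = select(graphs, loaded["test"])

print(len(train_set), len(valid_set), len(test_set))
```

The point of the sketch is only that one stored collection plus one index file is enough to recover all three subsets, so no separate validation or test download is needed.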
I hope this answers your question. Please feel free to ask if you have any further questions.
Thanks a lot :) I understood.
Hello, first of all, thank you for your interesting project. I am simply wondering if there are only training datasets linked in your GitHub repository.