HuieL / GRAG

MIT License

dataset/webqsp/cached_desc/0.txt not found #4

Closed mlrus closed 2 weeks ago

mlrus commented 2 weeks ago

How can I resolve these errors? Is it a versioning issue or a missing parameter? The dataset/webqsp/cached_desc/ files do not exist.

Steps:

  1. Generated the files dataset/webqsp/graphs/*.pt with python -m src.dataset.preprocess.webqsp
  2. Created the missing directory: mkdir dataset/webqsp/cached_graphs
  3. Hard-linked graphs/ into cached_graphs/: cd dataset/webqsp; ln graphs/* cached_graphs

At this point there is a new error: dataset/webqsp/cached_desc/0.txt is not found, and there are no numbered .txt files in that directory at all.

Here is the runtime output:

$ python train.py --dataset webqsp --model_name graph_llm --seed 3
inherit model weights from sentence-transformers/all-roberta-large-v1
/data2/_user_name_/anaconda3/envs/grag/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
Namespace(model_name='graph_llm', project='projection', seed=3, dataset='webqsp', lr=1e-05, wd=0.05, patience=5, batch_size=2, grad_steps=2, num_epochs=10, warmup_epochs=1, eval_batch_size=16, llm_model_name='7b', llm_model_path='', llm_frozen='True', llm_num_virtual_tokens=10, output_dir='output', max_txt_len=512, max_new_tokens=32, gnn_model_name='gat', gnn_num_layers=4, gnn_in_dim=1024, gnn_hidden_dim=1024, alignment_mlp_layers=3, gnn_num_heads=4, distance_operator='euclidean', gnn_dropout=0.0)
Traceback (most recent call last):
  File "/data2/_user_name_/gits/GRAG/train.py", line 139, in <module>
    main(args)
  File "/data2/_user_name_/gits/GRAG/train.py", line 34, in main
    train_dataset = [dataset[i] for i in idx_split['train']]
  File "/data2/_user_name_/gits/GRAG/train.py", line 34, in <listcomp>
    train_dataset = [dataset[i] for i in idx_split['train']]
  File "/data2/_user_name_/gits/GRAG/src/dataset/webqsp.py", line 41, in __getitem__
    desc = open(f'{cached_desc}/{index}.txt', 'r').read()
FileNotFoundError: [Errno 2] No such file or directory: 'dataset/webqsp/cached_desc/0.txt'

HuieL commented 2 weeks ago

The cached_graphs folder will be generated automatically by running python -m src.dataset.expla_graphs. Our data-processing logic is:

  1. python -m src.dataset.preprocess.webqsp will generate graphs using triples in the original data;
  2. python -m src.dataset.expla_graphs performs retrieval on those graphs and stores the retrieved subgraphs as caches for the later generation step.

If retrieval were re-run every time the model is trained, hyperparameter tuning would become much slower. You can store subgraphs retrieved under different settings in different folders; during training, you only need to point the data path at the corresponding folder.

Hope this is helpful!