Fantabulous-J / CLASS


Error when running test_reader.py #1

Open hungptit123 opened 5 months ago

hungptit123 commented 5 months ago

Script Run:

python test_reader.py \
  --model_name_or_path ${MODEL_PATH}/checkpoint-best \
  --output_dir ${MODEL_PATH} \
  --output_path dev_xor_retrieve_results.json \
  --train_dir ${DATA_PATH} \
  --train_path xor_train_retrieve_eng_span.jsonl \
  --query_file xor_dev_retrieve_eng_span_v1_1.jsonl \
  --corpus_file psgs_w100.tsv \
  --tf32 True \
  --per_device_train_batch_size 4 \
  --per_device_eval_batch_size 512 \
  --gradient_accumulation_steps 1 \
  --negatives_x_device \
  --grad_cache \
  --refresh_passages \
  --refresh_intervals 1000 \
  --separate_joint_encoding \
  --de_avg_pooling \
  --gradient_checkpointing \
  --gc_chunk_size 8 \
  --retriever_weight 8 \
  --multi_task \
  --ddp_find_unused_parameters False \
  --train_n_passages 100 \
  --max_query_length 50 \
  --max_passage_length 200 \
  --max_query_passage_length 250 \
  --max_answer_length 50 \
  --learning_rate 5e-5 \
  --max_steps 6000 \
  --num_train_epochs 1 \
  --distillation_start_steps 0 \
  --weight_decay 0.01 \
  --dataloader_num_workers 1 \
  --print_steps 20

Log Error:

Original Traceback (most recent call last):
  File "/home/os_callbot/miniconda3/envs/llm_class/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/os_callbot/miniconda3/envs/llm_class/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/os_callbot/miniconda3/envs/llm_class/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/os_callbot/workspace/hungdv/CLASS/dataloader.py", line 208, in __getitem__
    qid, pids = example['qid'], example['pids']
KeyError: 'qid'

-> I checked the dataset, but I can't find the keys "qid" and "pids" in the data.

Fantabulous-J commented 5 months ago

Thanks for your interest in our work. I have added the scripts for reader evaluation. You can find them here for XOR-Retrieve and here for XOR-Full.

The dev_xor_retrieve_pids.jsonl file should be the result from the retrieval step.
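
If it helps, here is a minimal sketch for checking a retrieval output file before passing it to the reader. It assumes the file holds one JSON object per line (going by the .jsonl extension) and that each record needs the "qid" and "pids" keys that dataloader.py reads in the traceback above. The helper names find_bad_records and check_file are hypothetical and not part of this repository.

```python
import json
from typing import Iterable, List, Tuple

# Keys that dataloader.py reads in __getitem__, according to the traceback.
REQUIRED_KEYS = ("qid", "pids")


def find_bad_records(
    lines: Iterable[str], required: Tuple[str, ...] = REQUIRED_KEYS
) -> List[Tuple[int, List[str]]]:
    """Return (line_number, missing_keys) for each record lacking a required key."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        missing = [key for key in required if key not in record]
        if missing:
            bad.append((lineno, missing))
    return bad


def check_file(path: str) -> bool:
    """Print every problematic record in a JSONL file; return True if none are found."""
    with open(path, encoding="utf-8") as f:
        bad = find_bad_records(f)
    for lineno, missing in bad:
        print(f"{path}:{lineno}: missing keys {missing}")
    return not bad
```

For example, calling check_file("dev_xor_retrieve_pids.jsonl") would list any lines without "qid" or "pids". If every line is reported, the file is probably the raw query file rather than the output of the retrieval step.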