FasterDecoding / REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024
Apache License 2.0

OSError: failed to fill whole buffer #19

Open Siegfried-qgf opened 2 hours ago

Siegfried-qgf commented 2 hours ago

When I initialize draftretriever.Reader, I hit this error:

python3 gen_model_answer_rest.py
loading the datastore ...
Traceback (most recent call last):
  File "/mnt/gefei/REST/llm_judge/gen_model_answer_rest.py", line 493, in <module>
    run_eval(
  File "/mnt/gefei/REST/llm_judge/gen_model_answer_rest.py", line 135, in run_eval
    datastore = draftretriever.Reader(
  File "/root/anaconda3/envs/rest/lib/python3.9/site-packages/draftretriever/__init__.py", line 43, in __init__
    self.reader = draftretriever.Reader(
OSError: failed to fill whole buffer

zhenyuhe00 commented 2 hours ago

Hi, I've not encountered this error before. I wonder if you've fully built the datastore without any interruptions.

Siegfried-qgf commented 2 hours ago

I checked the datastore build and found a segmentation fault:

Namespace(model_path='/mnt/tianlian/deployment/llm_task_flows/model_original/hugging_face_finetune/Qwen2.5-14B-Instruct', large_datastore=False)
number of samples: 68623
100%|████████████| 68623/68623 [04:13<00:00, 271.09it/s]
[1]    32657 segmentation fault (core dumped)  python3 get_datastore_chat.py

Siegfried-qgf commented 1 hour ago

When I limit the dataset to 100 samples it finishes fine, but with 2500 samples it crashes:

python3 get_datastore_chat.py
Namespace(model_path='/mnt/tianlian/deployment/llm_task_flows/model_original/hugging_face_finetune/Qwen2.5-14B-Instruct', large_datastore=False)
number of samples: 100
100%|████████████| 100/100 [00:00<00:00, 342.49it/s]

python3 get_datastore_chat.py
Namespace(model_path='/mnt/tianlian/deployment/llm_task_flows/model_original/hugging_face_finetune/Qwen2.5-14B-Instruct', large_datastore=False)
number of samples: 2500
100%|████████████| 2500/2500 [00:08<00:00, 307.74it/s]
[1]    56767 segmentation fault (core dumped)  python3 get_datastore_chat.py

Could this be related to how my image was created?

zhenyuhe00 commented 1 hour ago

Hi, I suppose it's because the vocab size of Qwen2.5 is 151936, which exceeds the range of the u16 I hard-coded in DraftRetriever. To fix the issue, you may change this line in the writer from

self.index_file.write_u16::<LittleEndian>(item as u16)?;

to

self.index_file.write_u32::<LittleEndian>(item as u32)?;

Besides, change this line in the Reader from

let int = LittleEndian::read_u16(&data_u8[i..i+2]) as i32;

to

let int = LittleEndian::read_u32(&data_u8[i..i+4]) as i32;
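
To illustrate the failure mode, here is a minimal, self-contained sketch (not the actual draftretriever writer/reader code, just the byteorder calls involved; the token id 151000 is an arbitrary example above u16::MAX):

// Sketch only; requires the byteorder crate.
use byteorder::{LittleEndian, ReadBytesExt, WriteBytesExt};
use std::io::Cursor;

fn main() -> std::io::Result<()> {
    // An example Qwen2.5 token id above u16::MAX (65535).
    let token_id: u32 = 151_000;

    let mut buf: Vec<u8> = Vec::new();
    // Old writer behaviour: 2 bytes per token id, silently wraps for ids > 65535.
    buf.write_u16::<LittleEndian>(token_id as u16)?;
    // Fixed behaviour: 4 bytes per token id covers the full Qwen2.5 vocab (151936).
    buf.write_u32::<LittleEndian>(token_id)?;

    let mut cursor = Cursor::new(buf);
    let truncated = cursor.read_u16::<LittleEndian>()? as u32; // 19928: corrupted id
    let full = cursor.read_u32::<LittleEndian>()?;             // 151000: round-trips correctly
    println!("u16 round-trip: {truncated}, u32 round-trip: {full}");
    Ok(())
}

Note that after the change the reader advances 4 bytes per token (hence i..i+4), and a datastore built with the old 2-byte format would presumably need to be rebuilt with the patched writer.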

Hope these changes fix the bug. If you have any further questions, please feel free to contact me.