SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
https://arxiv.org/pdf/2406.16858
Apache License 2.0

CUDA out of memory #13

Closed pengfeiwu1999 closed 10 months ago

pengfeiwu1999 commented 11 months ago

When I run the script as python ge_data_all_vicuna.py --start=0 --end=17000 --index=0 --gpu_index 0 --outdir ./data_generated/sharegpt_0_67999_mufp16, it fails with "CUDA out of memory". I am using a single A100 GPU with 80 GB of memory and the vicuna-13B model. I checked the data and the longest input_ids sequence is 20000+ tokens, which undoubtedly exceeds the memory limit. How do you generate the data?

Liyuhui-12 commented 11 months ago

Line 83 of ge_data_all_vicuna.py ensures that the sequence length does not exceed max_length=tokenizer.model_max_length. If you are using the correct vicuna weights, tokenizer.model_max_length should be 2048 (as seen in https://huggingface.co/lmsys/vicuna-13b-v1.3/blob/main/config.json). An alternative solution is to set max_length manually, but we recommend first checking whether there is an issue with your vicuna weight folder.
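In other words, the data-generation script caps every tokenized conversation at the model's context window before it reaches the GPU. A minimal sketch of that cap (the helper name, the manual 2048 fallback, and the 20000-token example are illustrative, not the script's actual code):

```python
# Sketch of the length cap described above. In the real script the cap
# comes from the Hugging Face vicuna tokenizer's model_max_length; here
# the value is passed in directly for illustration.
def truncate_ids(input_ids, model_max_length=2048):
    """Keep at most model_max_length tokens per conversation."""
    return input_ids[:model_max_length]

# An over-long conversation like the 20000+-token one reported above:
long_ids = list(range(20000))
capped = truncate_ids(long_ids)
print(len(capped))  # 2048
```

If the tokenizer config is wrong (e.g. model_max_length reports a huge sentinel value instead of 2048), no truncation happens and a single long conversation can exhaust even 80 GB, which matches the symptom reported here.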