payne4handsome closed 1 year ago
The cause of this exception is running out of CPU memory. I resolved it by setting low_cpu_mem_usage=True in from_pretrained(). I am closing this issue.
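To make the CPU-memory explanation concrete, here is a rough back-of-the-envelope sketch. It assumes every training process loads its own full copy of the weights into host RAM before moving them to the GPU. The 7e9 parameter count, the 2 bytes per parameter, and the peak factors (about 2x model size for a default load, about 1x with low_cpu_mem_usage=True) are illustrative assumptions, not measurements from this issue.

```python
def estimate_host_ram_gib(num_params: float,
                          bytes_per_param: int,
                          num_processes: int,
                          peak_factor: float) -> float:
    """Rough peak host RAM (GiB) if each process holds its own copy of the weights.

    peak_factor is an assumed multiple of the model size held in CPU memory
    while loading (roughly 2.0 for a default load, 1.0 with low_cpu_mem_usage=True).
    """
    total_bytes = num_params * bytes_per_param * num_processes * peak_factor
    return total_bytes / 2**30


# Hypothetical numbers: a 7B-parameter model, 2 bytes per parameter, 8 processes.
default_load = estimate_host_ram_gib(7e9, 2, 8, peak_factor=2.0)
low_mem_load = estimate_host_ram_gib(7e9, 2, 8, peak_factor=1.0)
print(f"default load:      ~{default_load:.0f} GiB")
print(f"low_cpu_mem_usage: ~{low_mem_load:.0f} GiB")
```

If the estimate is close to or above the machine's physical RAM, the load step can fail before training starts.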
How do you set low_cpu_mem_usage=True? Can you show the code? Thank you very much!
@yytzsy If you train with DeepSpeed stage 2, the code looks like this:
model = LlavaLlamaForCausalLM.from_pretrained(
    model_args.model_name_or_path,
    cache_dir=training_args.cache_dir,
    low_cpu_mem_usage=True,
    **bnb_model_from_pretrained_args
)
If you use DeepSpeed stage 3, you don't need to do this; training will run correctly.
There is a contradiction here. When I use zero3_offload.json, it reports
exits with return code = -9
When I use zero2.json, it reports OOM. I don't know how to solve it.
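For anyone reading the "return code = -9" message: in Python's subprocess convention, a negative return code -N means the child process was terminated by signal N, and signal 9 on Linux is SIGKILL. That is commonly what the kernel's out-of-memory killer sends when host RAM is exhausted, although other senders are possible. The helper below is a small sketch for decoding such codes; describe_return_code is a hypothetical name and is not part of DeepSpeed or LLaVA.

```python
import signal


def describe_return_code(code: int) -> str:
    """Translate a subprocess-style return code into a readable description."""
    if code < 0:
        # Negative codes mean the process was terminated by signal number -code.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {-code} ({name})"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


print(describe_return_code(-9))
```

If the code decodes to SIGKILL, checking the kernel log (for example with dmesg) for out-of-memory killer entries is a reasonable next step.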
I have set low_cpu_mem_usage=True, but the issue still exists. What should I do next? Thanks.
Describe the issue
Issue: Hi @haotian-liu, please help. I have downloaded the llama-2 weights, llava-150k, and pretrain_mm_mlp_adapter. I just want to test that the program runs correctly, but it exits abnormally without printing any useful information.
I am training LLaVA on 8 NVIDIA 3090 (24 GB) GPUs.
Command:
Log:
Screenshots: my finetune_full_schedule.sh looks like the following.