I currently have 4 RTX 4090 GPUs, each with 24 GB of memory, but I hit an out-of-memory error when running the 7B model. I changed the model loading to use `torch_dtype=torch.float16`:

```python
PointLLMLlamaForCausalLM.from_pretrained(model_name, low_cpu_mem_usage=True, use_cache=True, torch_dtype=torch.float16).cuda()
```

With that change the model loads, but during inference I get this error:

```
[ERROR] Input type (float) and bias type (c10::Half)
```

I would appreciate your help, thank you.
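For context, here is a minimal sketch of what I'm running. The import path, checkpoint path, tensor shapes, and the `point_clouds` keyword are my best guesses rather than the exact PointLLM API. My suspicion is that the inputs are still float32 while the weights are fp16; does casting the inputs to the model's dtype, as below, look like the right fix?

```python
import torch
from pointllm.model import PointLLMLlamaForCausalLM  # import path assumed

model_name = "path/to/PointLLM_7B"  # substitute the actual checkpoint

# Load the weights in fp16 so the 7B model fits in a single 24 GB card.
model = PointLLMLlamaForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    use_cache=True,
    torch_dtype=torch.float16,
).cuda()

# input_ids and point_clouds come from PointLLM's own tokenizer/preprocessing;
# the dummy values here are only to make the sketch self-contained.
input_ids = torch.randint(0, 32000, (1, 16)).cuda()  # fake prompt tokens
point_clouds = torch.rand(1, 8192, 6)                # fake cloud (xyz + rgb); shape assumed

# This is roughly where the error appears. My guess: the point cloud is still
# float32 while the model's weights/biases are fp16, which would explain
# "Input type (float) and bias type (c10::Half)". Casting the inputs to the
# model's dtype should make them match (the point_clouds keyword in generate()
# is my assumption):
point_clouds = point_clouds.to(device="cuda", dtype=torch.float16)
output_ids = model.generate(input_ids, point_clouds=point_clouds)
```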