unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Load qwen2.5-32b on 4 GPUs #1131

Closed: luoruijie closed 1 month ago

luoruijie commented 1 month ago

Hi, I want to know how to use Unsloth to load qwen2.5-32b-instruct (fp16) on 4 RTX 4090 24 GB GPUs.

My code is below; maybe adding some parameters can solve this?

```python
import time

import torch
from unsloth import FastLanguageModel

dtype = None

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen2.5-32B-Instruct",
    max_seq_length = 8196,
    dtype = dtype,
    load_in_4bit = False,
    device_map = "auto",
)
```
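For scale, the fp16 weights alone come to roughly 60 GB, which cannot fit on one 24 GB RTX 4090 but, sharded across four cards, would take about 15 GB each (weights only; KV cache and activations add more). A quick back-of-the-envelope check:

```python
# Rough memory estimate for Qwen2.5-32B in fp16 (weights only;
# KV cache, activations, and CUDA overhead are extra).
params = 32e9            # approximate parameter count
bytes_per_param = 2      # fp16
weights_gb = params * bytes_per_param / 1024**3
print(f"fp16 weights: ~{weights_gb:.0f} GB")          # ~60 GB
print(f"per GPU across 4: ~{weights_gb / 4:.0f} GB")  # ~15 GB < 24 GB
```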

danielhanchen commented 1 month ago

Sorry, Unsloth does not yet support multi-GPU setups - we're finalizing a community beta program, so it should be in a more stable form for everyone soon!
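Until then, a common stopgap is to load the model with plain Hugging Face transformers and let accelerate shard it across the four GPUs. A minimal sketch, assuming `transformers` and `accelerate` are installed (this gives multi-GPU inference only, not Unsloth's fast finetuning path):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-32B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the fp16 layers across all visible GPUs
# (4x RTX 4090 here), keeping each shard well under 24 GB.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
```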

luoruijie commented 1 month ago

OK, thanks for the reply.