hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

Is there a way to do QLoRA 8-bit for Llama3 70B using 2×A6000? #5364

Closed · etemiz closed this issue 2 months ago

etemiz commented 2 months ago


System Info

LLaMA-Factory 0.8.3

Reproduction

I used the example at https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/train_qlora/llama3_lora_sft_otfq.yaml and changed `quantization_bit: 8`. It didn't work: it tries to load the model onto both A6000 cards, which have 48 GB of memory each, allocating up to roughly 80 GB.
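Roughly, the config I ran looks like this (a sketch: the field names follow the linked example, but the model path, dataset, and output path are illustrative placeholders rather than the example's exact values):

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-70B-Instruct  # placeholder path
quantization_bit: 8                # changed from 4 in the linked example
quantization_method: bitsandbytes  # on-the-fly quantization

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset (placeholder values)
dataset: alpaca_en_demo
template: llama3

### output (placeholder path)
output_dir: saves/llama3-70b/lora/sft
```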

How do I split the training across the two GPUs?

And `fsdp_qlora` with 8-bit gives an error: `only 4-bit quantized model can use fsdp+qlora or auto device map`.
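For reference, the launch command was along these lines (a sketch: the `llamafactory-cli train` entry point and the `FORCE_TORCHRUN` variable are how the README describes multi-GPU runs, if I read it correctly; the config filename is mine):

```bash
# run on both A6000s; FORCE_TORCHRUN=1 makes the CLI launch via torchrun
CUDA_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 \
    llamafactory-cli train llama3_70b_qlora_8bit.yaml
```

With plain data parallelism each GPU holds a full copy of the model, which is presumably why both cards fill up.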

Expected behavior

How do I split the training across the two GPUs?

Others

No response

hiyouga commented 2 months ago

FSDP + QLoRA only supports 4-bit quantization.
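For contrast, a config that FSDP + QLoRA does accept would look roughly like this (a sketch: field names follow the QLoRA examples, the model path is illustrative, and the run goes through the repo's fsdp_qlora example launcher rather than a plain DDP launch):

```yaml
### model (only the quantization lines differ from the 8-bit attempt above)
model_name_or_path: meta-llama/Meta-Llama-3-70B-Instruct  # illustrative
quantization_bit: 4                # the only bit width fsdp+qlora accepts
quantization_method: bitsandbytes
```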

etemiz commented 1 month ago

The README says 8-bit QLoRA on a 70B model is achievable with 80 GB of VRAM. I have 2×A6000. Is there a way I can do 8-bit QLoRA on a 70B model using my cards?