Closed: etemiz closed this issue 2 months ago.
fsdp+qlora only supports 4-bit quantization.
In the README it says QLoRA 70B at 8-bit is achievable with 80 GB of VRAM. I have 2× A6000 cards. Is there a way I can do QLoRA 70B 8-bit using my cards?
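Given that constraint, the route the examples support on two 48 GB cards is a 4-bit 70B QLoRA run sharded with FSDP. A minimal sketch of such a training file, assuming the key names from the linked otfq example (the model id, dataset, template, and output path are placeholders, and the exact fsdp_qlora example files shipped with 0.8.3 may differ):

```yaml
### model (placeholder 70B checkpoint; 4 bits so fsdp+qlora is allowed)
model_name_or_path: meta-llama/Meta-Llama-3-70B-Instruct
quantization_bit: 4

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all

### dataset (placeholders)
dataset: identity
template: llama3
cutoff_len: 1024
max_samples: 1000

### output
output_dir: saves/llama3-70b/lora/sft
logging_steps: 10
save_steps: 500

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
bf16: true
```

It would then be launched the way the fsdp_qlora recipe under examples/extras/ does it, i.e. an accelerate launch with one process per GPU, rather than a plain single-process run.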
Reminder
System Info
LLaMA-Factory 0.8.3
Reproduction
I used the example here: https://github.com/hiyouga/LLaMA-Factory/blob/main/examples/train_qlora/llama3_lora_sft_otfq.yaml and changed `quantization_bit: 8`. It didn't work because it tries to fill both A6000 cards, which have 48 GB of memory each, loading up to 80 GB or so.
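Concretely, the modification is just the quantization setting in that file; a sketch of the relevant part (the 70B model id is a placeholder for whichever checkpoint is loaded, the remaining keys stay as in the example):

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-70B-Instruct   # placeholder for the 70B model used
quantization_bit: 8   # changed from the example's 4; at 8 bits a 70B model needs roughly 70 GB for the weights alone
```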
How do I divide the training across the two GPUs?
And fsdp_qlora with 8-bit gives the error: `only 4-bit quantized model can use fsdp+qlora or auto device map`.
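For illustration, spreading a run over both GPUs with FSDP goes through an accelerate config along the lines of examples/accelerate/fsdp_config.yaml in the repo. A sketch using the standard accelerate FSDP fields (the values are assumptions; compare against the shipped file), with num_processes set to 2 for the two A6000s:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: BACKWARD_PRE
  fsdp_offload_params: true            # offload parameters to CPU so the 70B shards fit in 48 GB
  fsdp_sharding_strategy: FULL_SHARD   # shard parameters, gradients and optimizer state across both cards
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_sync_module_states: true
  fsdp_use_orig_params: false
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2   # one process per A6000
rdzv_backend: static
same_network: true
```

As the error above says, this path is only accepted together with `quantization_bit: 4`, so it splits a 4-bit run across the two cards but does not enable the 8-bit configuration.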
Expected behavior
The training should be divided across the two GPUs.
Others
No response