Open rasbt opened 1 year ago
Can "Accelerate" help? https://github.com/artidoro/qlora/issues/96
Although training speed does not increase, the model can then fit in GPU memory.
AFAIK, the problem with combining quantization and FSDP is checkpointing, so a first step towards this goal would be to support it for inference. This would also unblock running Falcon 180B.
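For a sense of scale, here is a rough back-of-the-envelope sketch of why a model of this size needs more than one GPU even when quantized. It assumes a nominal 180 billion parameters, counts weights only (no activations, KV cache, or quantization constants), and assumes an 80 GB GPU; the helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 180e9     # assumption: nominal parameter count for Falcon 180B
GPU_MEMORY_GB = 80   # assumption: per-GPU memory capacity

for bits in (16, 8, 4):
    need = weight_memory_gb(N_PARAMS, bits)
    fits = need <= GPU_MEMORY_GB
    print(f"{bits:>2}-bit weights: {need:.0f} GB, fits on one GPU: {fits}")
```

Under these assumptions, even 4-bit weights exceed a single 80 GB device, so the model has to be sharded across GPUs for inference.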
@carmocca Hi, do you mean that QLoRA & FSDP can't work together, but QLoRA & DDP can? Or can QLoRA only work on a single GPU?
My comment is quite old now. I do not know what the status is today. Somebody needs to try it out and report back.
Currently, Multi-GPU support for QLoRA is disabled because we haven't tested it yet. It might be worthwhile looking into this at some point, so this issue is just a reminder to revisit it.