SparkJiao opened 1 year ago
Wonderful work!
May I know whether this is compatible with the ZeRO mechanism? E.g., torch's ZeroRedundancyOptimizer, DeepSpeed ZeRO-1 to ZeRO-3, and FairScale FSDP. I ask because I noticed that QLoRA relies on a specially implemented optimizer.
If the optimizer is not compatible with the tools mentioned above, can I use only 4-bit tuning and LoRA with the ZeRO mechanism? Will this cause more memory cost?
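For context, the combination being asked about would pair a 4-bit-quantized base model and LoRA adapters with a DeepSpeed ZeRO config along these lines (a minimal sketch; the stage, offload, and batch settings are illustrative assumptions, not a confirmed-working setup):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 8,
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  },
  "bf16": { "enabled": true }
}
```

ZeRO-2 only partitions optimizer states and gradients, so it would shard the (small) LoRA optimizer state; whether the 4-bit quantized base weights themselves interact correctly with ZeRO's partitioning is exactly the open question here.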
Thanks very much!
Best
I did try this a few days ago, but I think DeepSpeed currently still cannot support 4-bit training (backpropagation).