Open Robinysh opened 2 weeks ago
We'll be adding support for all models in a future release, which will enable Unsloth GC for other models! I'm unsure about normal full finetuning or pretraining - I would suggest using DeepSpeed to offload things, and not Unsloth.
Great to know it's on the to-do list. I'm not looking for offloading techniques, as the performance drop is quite significant; rather, I'm trying to do gradient checkpointing during pretraining. The PyTorch implementation should be good enough for the time being.
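For reference, a minimal sketch of the stock PyTorch gradient checkpointing mentioned above, wrapping each block of a toy model with `torch.utils.checkpoint.checkpoint` (the model architecture here is a placeholder, not anything from Unsloth):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """A toy residual feed-forward block standing in for a transformer layer."""
    def __init__(self, dim=64):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        return x + self.ff(x)

class Model(nn.Module):
    def __init__(self, dim=64, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # Activations inside each block are recomputed during backward
            # instead of being stored, trading extra compute for memory.
            x = checkpoint(blk, x, use_reentrant=False)
        return x

model = Model()
x = torch.randn(8, 16, 64, requires_grad=True)
loss = model(x).sum()
loss.backward()  # gradients flow through the recomputed activations
```

This works for full pretraining with no PEFT involved, which is exactly the situation described; the memory savings grow with depth and sequence length.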
Currently, Unsloth offers a customized version of gradient checkpointing that claims to be better. The only way I'm aware of using it is with the code below.
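The referenced snippet did not survive in the thread; a hedged reconstruction of the usual invocation, assuming Unsloth's documented `use_gradient_checkpointing="unsloth"` flag (the model name, sequence length, and LoRA hyperparameters are placeholders, and it requires `pip install unsloth` plus a GPU):

```python
# Sketch only - not runnable without the unsloth package and a CUDA device.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # placeholder checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    # "unsloth" selects Unsloth's customized gradient checkpointing
    use_gradient_checkpointing="unsloth",
)
```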
But using `FastLanguageModel.get_peft_model` will patch the model with LoRA. Is there any way to use the Unsloth customized gradient checkpointing without LoRA? Does it even make sense to use it without? Are the customized tricks specific to PEFT?