cc @younesbelkada @pacman100
Hi! This makes sense yes, can you open a PR with the suggested changes? 🙏
Feature request
I have multi-modal models with multiple different peft models on different submodules due to requiring different LoRA configurations. The following is an example:
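(The original snippet isn't reproduced in this thread; a minimal sketch of that kind of setup, with an illustrative checkpoint, submodule names, and target modules, might look like the following.)

```python
from transformers import LlavaForConditionalGeneration
from peft import LoraConfig, get_peft_model

# Placeholder multi-modal checkpoint; any model exposing separate vision/language
# submodules works the same way.
model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf")

# Different LoRA configurations for different submodules.
vision_lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
text_lora = LoraConfig(r=32, lora_alpha=64,
                       target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])

# Each submodule is wrapped in its own PeftModel; the top-level model is NOT a
# PeftModel, which is what trips up the Trainer check described below.
model.vision_tower = get_peft_model(model.vision_tower, vision_lora)
model.language_model = get_peft_model(model.language_model, text_lora)
```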
This works fine until you quantise the model, because the Hugging Face `Trainer` requires a quantised model to pass the `_is_peft_model` check; otherwise it assumes the entire model is quantised and therefore untrainable. A really simple fix would be to change it from an `_is_peft_model` check to checking whether there are any parameters that are unquantised AND of a trainable datatype (>= 16 bit). If it really should stick to being a PEFT-model check, it could instead check whether any submodules are PeftModels. In general, allowing functionality to generalise irrespective of the class or model type of the top-level model will be helpful for researchers in the multi-modal space such as myself. I understand that there is `PeftMixedModel`, but it seems largely unsupported and unwieldy compared to doing it like this, where I have complete control.
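For reference, the Trainer's check boils down to a type test on the top-level model only; a simplified sketch (not the exact `transformers` source) of `_is_peft_model` is:

```python
from peft import PeftModel, PeftMixedModel

def _is_peft_model(model):
    # Simplified: the real helper also guards on peft being installed and on the
    # installed peft version, but in essence it only inspects the top-level type.
    return isinstance(model, (PeftModel, PeftMixedModel))
```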
Motivation
Let me train quantised models that aren't PeftModels at their top level/aren't wrapped as a PeftModel.
Your contribution
Here's a version of `_is_peft_model` that checks for any parameter that requires gradients and is not 4- or 8-bit, along with a probably more acceptable version that specifically checks for any `PeftModel` submodules (sketches of both follow).
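(The snippets originally attached to the issue aren't reproduced here; the following is a sketch of what the two variants could look like, with illustrative function names.)

```python
import torch
from peft import PeftModel

def _is_peft_model_via_params(model):
    # Variant 1: treat the model as trainable if any parameter both requires
    # gradients and is stored in a non-quantised (16-bit or wider) floating dtype.
    trainable_dtypes = (torch.float16, torch.bfloat16, torch.float32, torch.float64)
    return any(
        param.requires_grad and param.dtype in trainable_dtypes
        for param in model.parameters()
    )

def _is_peft_model_via_submodules(model):
    # Variant 2: keep the "is this a PEFT model?" semantics, but accept a model
    # whose top level is plain as long as any submodule is a PeftModel.
    # (`model.modules()` yields the top-level module itself first, so a model
    # that is already a PeftModel still passes.)
    return any(isinstance(module, PeftModel) for module in model.modules())
```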