
Geometric Parametrization (GmP) [CLIP] - is it compatible with PEFT? #1863

Open zer0int opened 2 weeks ago

zer0int commented 2 weeks ago

Feature request

Support for GmP (Geometric Parametrization), here as applied to the module structure and parameter naming of the original OpenAI/CLIP model, i.e.:

"Normal" CLIP MLP (multi-layer perceptron):

(mlp): Sequential(
  (c_fc): Linear(in_features=1024, out_features=4096, bias=True)
  (gelu): QuickGELU()
  (c_proj): Linear(in_features=4096, out_features=1024, bias=True)
)

Parameter names:
  visual.transformer.resblocks.0.mlp.c_fc.weight
  visual.transformer.resblocks.0.mlp.c_fc.bias
  visual.transformer.resblocks.0.mlp.c_proj.weight
  visual.transformer.resblocks.0.mlp.c_proj.bias
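
(For reference, this layout can be printed directly from the original OpenAI/CLIP package; ViT-L/14 is used as the example here:)

```python
import clip  # the original OpenAI/CLIP package

model, _ = clip.load("ViT-L/14", device="cpu")

# module structure of the first vision transformer block's MLP
print(model.visual.transformer.resblocks[0].mlp)

# the corresponding parameter names
for name, _ in model.named_parameters():
    if name.startswith("visual.transformer.resblocks.0.mlp"):
        print(name)
```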

GmP CLIP MLP:

Weight decomposition into:
- a radial component 'r': the norm of the pre-trained weights
- an angular component 'theta': the normalized direction
This preserves both the magnitude and the directionality of the pre-trained weight vectors (see the sketch after the module listing below).

(mlp): Sequential(
  (c_fc): GeometricLinear()
  (gelu): QuickGELU()
  (c_proj): GeometricLinear()
)

Parameter names:
  visual.transformer.resblocks.0.mlp.c_fc.r
  visual.transformer.resblocks.0.mlp.c_fc.theta
  visual.transformer.resblocks.0.mlp.c_fc.bias
  visual.transformer.resblocks.0.mlp.c_proj.r
  visual.transformer.resblocks.0.mlp.c_proj.theta
  visual.transformer.resblocks.0.mlp.c_proj.bias

(Same thing for [text] transformer.resblocks)
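
Based on that description, here is a minimal sketch of what GeometricLinear does (assuming a per-output-row decomposition; the actual implementation is in the repo linked under "Your contribution"):

```python
import torch.nn as nn
import torch.nn.functional as F

class GeometricLinear(nn.Module):
    """Drop-in replacement for nn.Linear: the weight matrix is stored as a
    radial part `r` (row-wise norm of the pre-trained weights) and an
    angular part `theta` (the normalized row directions)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        w = linear.weight.data  # shape: (out_features, in_features)
        self.r = nn.Parameter(w.norm(dim=1, keepdim=True))  # radial component
        self.theta = nn.Parameter(F.normalize(w, dim=1))     # angular component
        self.bias = nn.Parameter(linear.bias.data.clone()) if linear.bias is not None else None

    def forward(self, x):
        # reconstruct the effective weight as r * theta / ||theta|| (row-wise)
        weight = self.r * F.normalize(self.theta, dim=1)
        return F.linear(x, weight, self.bias)

# swap the MLP Linears for GeometricLinear (here: the vision tower of the
# OpenAI/CLIP model loaded above; the text tower is handled the same way)
for block in model.visual.transformer.resblocks:
    block.mlp.c_fc = GeometricLinear(block.mlp.c_fc)
    block.mlp.c_proj = GeometricLinear(block.mlp.c_proj)
```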

Motivation

I have had excellent results with GmP on a full fine-tune of CLIP ViT-L/14 (COCO-40k, batch_size=36 (!!!)), boosting ImageNet/ObjectNet accuracy from ~0.84 (original OpenAI/CLIP ViT-L/14) to, most recently, >0.90:

[Attached plots: finetune-results, finetune-stimulation-results]

However, I am not sure if this could possibly even work without updating all weights during fine-tuning. I'd be delighted to know, so I could decide whether or not to try and pursue this project further - or if I should rather "pursue" a cloud computing instance and just train the full model. ;-)

Your contribution

I have modified "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k" to use GmP, but when trying to apply a LoRA adapter with PEFT I got:

ValueError: Target module GeometricLinear() is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
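
For context, the call that triggers this is essentially the standard PEFT recipe (a hypothetical minimal reproduction; target module names and hyperparameters are illustrative):

```python
from peft import LoraConfig, get_peft_model

# `model` is the CLIP model after its MLP Linears were replaced by GeometricLinear
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_fc", "c_proj"],
)
peft_model = get_peft_model(model, lora_config)  # -> raises the ValueError above
```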

I have a working implementation of GmP for FULL model finetunes using the original OpenAI/CLIP. Entire code to reproduce my GmP modification + finetune / results is available here: https://github.com/zer0int/CLIP-fine-tune

BenjaminBossan commented 2 weeks ago

So IIUC, GeometricLinear replaces what would normally be a nn.Linear module and your goal would be to apply LoRA to this layer. As you observed, this is unfortunately not possible. Each layer type needs to be explicitly implemented in LoRA for it to be supported.
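
Concretely, "explicitly implemented" would mean a dedicated LoRA layer that knows how to wrap GeometricLinear, conceptually along these lines (a rough sketch based on the GeometricLinear above, not actual PEFT code):

```python
import torch.nn as nn

class LoraGeometricLinear(nn.Module):
    """Conceptual only: freeze a GeometricLinear and add a trainable
    low-rank update on top of its output, mirroring what PEFT's LoRA
    layers do for nn.Linear."""

    def __init__(self, base, rank=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # only the LoRA weights train
        out_features, in_features = base.theta.shape
        self.lora_A = nn.Linear(in_features, rank, bias=False)
        self.lora_B = nn.Linear(rank, out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)   # adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))
```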