bash-j opened this issue 3 months ago
That's an interesting idea! I am not sure what would happen, since converting from a weight matrix to theta and r affects all parameters; what happens to those that are frozen? It may be fine, though, because this modification preserves the learned features. I am just not sure what happens if you train like that while some parameters can't update. But I'd say it's possible it will behave just the same as a weight matrix with a large number of parameters frozen (except that better model training ensues with GmP!).
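For context, the core of GmP is decomposing each weight row into a radial component r (the row's norm) and an angular component theta (the unit direction), so the original weights are recoverable as r * theta. Here's a minimal sketch of that idea; the class name `GeometricLinear` comes up later in this thread, but the exact GmP implementation in the repo may differ from this:

```python
import torch
import torch.nn as nn

class GeometricLinear(nn.Module):
    """Sketch of a geometric parameterization of nn.Linear:
    each weight row w_i is stored as r_i (its norm) times
    theta_i (its unit direction), so w_i = r_i * theta_i."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        w = linear.weight.data
        norms = w.norm(dim=1, keepdim=True)
        self.r = nn.Parameter(norms)          # per-row magnitude
        self.theta = nn.Parameter(w / norms)  # per-row direction
        self.bias = linear.bias

    def forward(self, x):
        # Reconstruct the weight matrix on the fly.
        weight = self.r * self.theta
        return nn.functional.linear(x, weight, self.bias)

# Round-trip check: the converted layer matches the original one.
lin = nn.Linear(8, 4)
geo = GeometricLinear(lin)
x = torch.randn(2, 8)
assert torch.allclose(lin(x), geo(x), atol=1e-6)
```

Note that every row of the matrix contributes to both r and theta, which is exactly why "freezing" in the original weight-matrix coordinates doesn't map cleanly onto the new parameters.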
And yeah, they renamed the model structure (same as with HF openai/clip, which, in addition to all the attaching-of-ViT and stuff, I also had to handle in the conversion scripts we discussed last time).
If you do a diff of these files:
diff orclip/modeloriginal.py gmpclip/model.py
...you can tell ChatGPT exactly what changed, and then tell it to implement those changes in PEFT's syntax. Make sure you provide the entire class to the LLM, but explicitly point out the function that was altered within it. GPT-4o should be able to apply that to whatever naming PEFT uses.
To state it upfront: I probably won't have time to look into this before the weekend, but please do give me an update (no matter whether it's a problem or a success)! I'm curious how that will turn out. Who knows, maybe it is a way to train BIG-G after all? 😄
Sad update for now :(
peft_model = get_peft_model(modified_clip_model, peft_config)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\mapping.py", line 149, in get_peft_model
return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config, adapter_name=adapter_name)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\peft_model.py", line 1170, in __init__
super().__init__(model, peft_config, adapter_name)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\peft_model.py", line 138, in __init__
self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\tuners\lora\model.py", line 139, in __init__
super().__init__(model, config, adapter_name)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\tuners\tuners_utils.py", line 166, in __init__
self.inject_adapter(self.model, adapter_name)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\tuners\tuners_utils.py", line 372, in inject_adapter
self._create_and_replace(peft_config, adapter_name, target, target_name, parent, current_key=key)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\tuners\lora\model.py", line 223, in _create_and_replace
new_module = self._create_new_module(lora_config, adapter_name, target, **kwargs)
File "C:\Users\zer0int\AppData\Roaming\Python\Python310\site-packages\peft\tuners\lora\model.py", line 320, in _create_new_module
raise ValueError(
ValueError: Target module GeometricLinear() is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
I'd probably have to hack PEFT itself, and that would presumably be a somewhat extreme project. Well, at least I have performed the surgery on the BIG-G for now! 🙃
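One possible way around PEFT's whitelist, short of patching peft itself, would be to wrap the custom layer by hand: freeze its parameters and add the low-rank A/B matrices yourself. A rough sketch in plain PyTorch (no peft involved); the class name `ManualLoRA` is made up for illustration, and I'm standing in a frozen nn.Linear for the actual GmP layer:

```python
import torch
import torch.nn as nn

class ManualLoRA(nn.Module):
    """Hand-rolled LoRA adapter: y = base(x) + (alpha/rank) * x @ A^T @ B^T."""

    def __init__(self, base, in_features, out_features, rank=4, alpha=8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the wrapped layer
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        # B starts at zero, so the adapter is a no-op before training.
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_A.T) @ self.lora_B.T

    def merge_into_linear(self):
        """Fold the adapter back into a plain nn.Linear (the 'merge' step).
        Assumes the base layer exposes an effective .weight / .bias."""
        with torch.no_grad():
            delta = self.scale * (self.lora_B @ self.lora_A)
            merged = nn.Linear(self.lora_A.shape[1], self.lora_B.shape[0])
            merged.weight.copy_(self.base.weight + delta)
            merged.bias.copy_(self.base.bias)
        return merged

# Because lora_B is zero-initialized, the wrapped layer starts out
# identical to the base layer, as in standard LoRA.
base = nn.Linear(8, 4)
wrapped = ManualLoRA(base, 8, 4)
x = torch.randn(2, 8)
assert torch.allclose(base(x), wrapped(x))
```

For a GeometricLinear base you'd compute the effective weight as r * theta inside merge_into_linear instead; only the low-rank matrices receive gradients, which is the part PEFT normally automates.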
Hello again!
Would it be possible to modify the GmP fine-tuning script to train a LoRA with PEFT for the CLIP ViT-G model, then merge the LoRA into the model to get a new CLIP-G model?
ChatGPT seems to think you can do it. But you have some clip module that only supports certain CLIP models from OpenAI, and not the CLIP-G, which I think is from LAION?