Closed 2U1 closed 2 months ago
@2U1 Hi, would you like to make a pull request with your addition?
@2U1 please :raised_hands:
@leestott @franperezlopez Yes sure!
I'd like to understand why the CLIP model can't be trained using LoRA, as stated in this comment: https://github.com/microsoft/Phi-3CookBook/blob/20d56d79cfd38eb175118ecc961a9b49e2341de2/code/04.Finetuning/vision_finetuning/finetune_hf_trainer_docvqa.py#L94
I made myself a `lora_config` based on this code, and so far it has worked:
```python
from peft import LoraConfig, TaskType

linear_modules = [
    # CLIP modules
    'q_proj',  # attention
    'k_proj',
    'v_proj',
    'out_proj',
    'fc1',  # MLP
    'fc2',
    # 'img_projection.0',
    # 'img_projection.2',
    # FIXME: LoRA not working on CLIP is a known issue of Phi-3-V
    # Phi language modules
    'qkv_proj',  # attention
    'o_proj',
    'down_proj',  # MLP
    'gate_up_proj',
    # 'lm_head',
]
lora_config = LoraConfig(
    r=rank,
    lora_alpha=round(rank * alpha_to_rank_ratio),
    lora_dropout=dropout,
    target_modules=linear_modules,
    init_lora_weights='gaussian',
    task_type=TaskType.CAUSAL_LM,
    modules_to_save=["lm_head"],
)
```
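For context on how a `target_modules` list like the one above takes effect: PEFT applies an adapter to every submodule whose name matches one of the targets (for string targets, a match on the final name component). The sketch below illustrates that matching rule in isolation; the module paths are hypothetical examples, not the actual Phi-3-Vision module names.

```python
# Standalone illustration of PEFT-style target-module matching:
# a module is selected when its name equals a target or ends with ".<target>".
def matches_target(module_name: str, targets: list[str]) -> bool:
    """Return True if module_name matches any target by suffix."""
    return any(
        module_name == t or module_name.endswith("." + t) for t in targets
    )

linear_modules = ["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2",
                  "qkv_proj", "o_proj", "down_proj", "gate_up_proj"]

# Hypothetical module names covering both the CLIP tower and the Phi LM:
all_modules = [
    "model.vision_embed_tokens.img_processor.encoder.layers.0.self_attn.q_proj",
    "model.vision_embed_tokens.img_processor.encoder.layers.0.mlp.fc1",
    "model.layers.0.self_attn.qkv_proj",
    "model.layers.0.mlp.gate_up_proj",
    "lm_head",
]

selected = [m for m in all_modules if matches_target(m, linear_modules)]
```

Here `lm_head` is not selected as a LoRA target; it is instead trained in full via `modules_to_save`, as in the config above.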
I was a bit busy with other work. I'm struggling to combine my changes into a single script, since I've modified some code in the processor and image_embedding. I'll make it available as soon as possible.
@ChenRocks
Interesting comment:
> I'd like to understand why the CLIP model can't be trained using LoRA, as stated in this comment
>
> I made myself a lora_config based on this code, and so far, it worked
> [LoRA config as quoted above]
On the latest main branch this is no longer a limitation.
@leestott @ChenRocks Oh, I was too late for this. I'll close the issue, because with the other changes I've made it would be too much for a cookbook.
Thanks for updating the code :) !
When fine-tuning the vision model, I think it's possible to fully fine-tune the vision model (without LoRA) while fine-tuning the language model with LoRA.
I've written code for this by borrowing from LLaVA. I hope it can be helpful for updating the training script for fine-tuning the vision model.
https://github.com/2U1/Phi3-Vision-ft
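The mixed strategy described above (full fine-tuning of the vision tower, LoRA on the language model) comes down to which parameters are left trainable. The sketch below shows that selection logic in isolation, using a name-to-flag map instead of a real model; the name prefixes are assumptions for illustration, not the actual Phi-3-Vision parameter paths.

```python
# Standalone sketch: freeze everything, then unfreeze (a) vision-tower
# parameters for full fine-tuning and (b) LoRA adapter weights on the LM.
def set_trainability(param_names: list[str]) -> dict[str, bool]:
    """Return a map of parameter name -> trainable flag."""
    trainable = {}
    for name in param_names:
        if name.startswith("model.vision_embed_tokens."):
            trainable[name] = True   # full fine-tune of the vision model
        elif ".lora_A." in name or ".lora_B." in name:
            trainable[name] = True   # LoRA adapter weights on the language model
        else:
            trainable[name] = False  # frozen base language-model weights
    return trainable

# Hypothetical parameter names:
params = [
    "model.vision_embed_tokens.img_processor.encoder.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.qkv_proj.weight",
    "model.layers.0.self_attn.qkv_proj.lora_A.default.weight",
    "model.layers.0.self_attn.qkv_proj.lora_B.default.weight",
]
flags = set_trainability(params)
```

With a real model, the same effect would be achieved by setting `requires_grad` on the corresponding `nn.Parameter`s after wrapping the language model with PEFT.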