Open orrzohar opened 2 weeks ago
Hi
I noticed that the learning rate for the instruction-tuning phase (1e-7) is 100 times smaller than the 1e-5 reported in the CogVLM technical report.
https://github.com/THUDM/CogVLM2/blob/57e5a80e996a0e36d9302e9efa3f63cfc29d3368/finetune_demo/peft_lora.py#L185
What is the reason for this? Is it due to LLaMA 3? Why is CogVLM2 so much less stable?
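For concreteness, here is a minimal sketch of the scale difference I mean. This is a hypothetical plain-gradient-step illustration only (the demo actually uses a full optimizer with scheduling); the two learning-rate values are taken from the report and from the linked `peft_lora.py`:

```python
# Hypothetical illustration of the learning-rate gap in question.
# These are the two values being compared, not a real training setup.
report_lr = 1e-5  # lr from the CogVLM technical report
demo_lr = 1e-7    # lr hard-coded in finetune_demo/peft_lora.py

grad = 0.5                      # example gradient value (made up)
step_report = report_lr * grad  # parameter delta at the reported lr
step_demo = demo_lr * grad      # parameter delta at the demo lr

ratio = step_report / step_demo
print(f"demo updates are ~{ratio:.0f}x smaller per step")
```

So with everything else equal, each update in the demo moves the weights roughly 100x less than the report's setting would.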
++ When you fine-tuned CogVLM2, did you use only LoRA?
Best, Orr