OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching the performance of GPT-4o.
https://internvl.readthedocs.io/en/latest/
MIT License

Fine-tuning Code For InternLM2 Series #398

Closed · KaranBhuva22 closed 2 months ago

KaranBhuva22 commented 2 months ago

Describe the bug

The InternVL2 series is outstanding. We would appreciate it if you could release both the fine-tuning code and detailed documentation on how to fine-tune an InternVL2 series model.

Thank you!!

Reproduction

.

Environment

.

Error traceback

No response

Road2Redemption commented 2 months ago

Looking forward to this code. Great work!

Road2Redemption commented 2 months ago

Are there scripts or code for fine-tuning the InternVL2 models too?

THU-Kingmin commented 2 months ago

+1

KaranBhuva22 commented 2 months ago

Screenshot from 2024-07-24 16-58-01: the image shows a TODO list where @czczup has marked "Release training/evaluation code for InternVL2 series" as complete. However, I cannot locate the actual training code for the InternVL2 series. @czczup, it would be great if you could guide us through the training steps!

chris-tng commented 2 months ago

I'm interested in the fine-tuning code as well. It would also be great if it supported multi-image fine-tuning.

Thanks

VietDunghacker commented 2 months ago

From what I have experimented with, the training procedure is the same as for InternVL 1.5, just with a different checkpoint. To support multi-image fine-tuning, you only need to add one <image> tag per image in the dataset, as sketched below.
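For concreteness, here is a minimal sketch of what one multi-image training sample might look like, extrapolating from the single-image JSONL format in the InternVL fine-tuning docs (the exact field names should be verified against the docs; the paths and text are illustrative):

```python
# One record of a multi-image JSONL training file (shown as a Python dict).
# Assumption: "image" may hold a list of paths, matched in order by one
# <image> placeholder per image inside the human turn.
sample = {
    "id": 0,
    "image": [                      # one path per image, relative to the data root
        "images/page_1.jpg",
        "images/page_2.jpg",
    ],
    "conversations": [
        {
            "from": "human",
            # one <image> tag per image, in the same order as the "image" list
            "value": "<image>\n<image>\nWhat changed between these two pages?",
        },
        {"from": "gpt", "value": "The second page adds a results table."},
    ],
}
```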

KaranBhuva22 commented 2 months ago

@VietDunghacker Have you fine-tuned InternVL2 using the 1.5 script? Did it work properly, and how were the results?

VietDunghacker commented 2 months ago

@KaranBhuva22 Yes, I have fine-tuned it on my custom dataset. The code has worked without errors so far, and the results are good.

KaranBhuva22 commented 2 months ago

Thanks @VietDunghacker

czczup commented 2 months ago

The fine-tuning documentation for InternVL2 has been released; you can try fine-tuning by following this guide.

If you run into any problems, feel free to report them to me, and I will revise the documentation to make it easier to use~

https://internvl.readthedocs.io/en/latest/internvl2.0/finetune.html
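As a quick orientation, the guide registers custom datasets through a meta JSON file that the fine-tuning scripts read. A minimal sketch of one entry, written as Python for illustration (the keys mirror the guide's example, but treat them as assumptions and verify against the linked docs):

```python
# Sketch of a custom-dataset meta file for the InternVL fine-tuning scripts.
# Dataset name, paths, and sample count are illustrative placeholders.
import json

meta = {
    "my_custom_dataset": {
        "root": "data/my_dataset/images/",            # image directory
        "annotation": "data/my_dataset/train.jsonl",  # one JSON sample per line
        "data_augment": False,
        "repeat_time": 1,
        "length": 10000,                              # number of samples in the JSONL
    }
}

with open("shell/data/my_custom_meta.json", "w") as f:
    json.dump(meta, f, indent=2)
```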

KaranBhuva22 commented 2 months ago

I will look into it. Thank you so much @czczup.

royzhang12 commented 2 months ago

> The fine-tuning documentation for InternVL2 has been released; you can try fine-tuning by following this guide.
>
> If you run into any problems, feel free to report them to me, and I will revise the documentation to make it easier to use~
>
> https://internvl.readthedocs.io/en/latest/internvl2.0/finetune.html

@czczup Thank you very much for the detailed documentation. Two questions:

  1. Is there a plan to release fine-tuning support for InternVL2-76B? Or can the 40B script be used directly to fine-tune it?
  2. Also, when fine-tuning the 76B model with https://github.com/OpenGVLab/InternVL/blob/main/internvl_chat/shell/internvl2.0/2nd_finetune/internvl2_1b_qwen2_0_5b_dynamic_res_2nd_finetune_lora.sh, torchrun launches multiple worker processes, each of which pre-loads the entire pretrained model via model = InternVLChatModel(internvl_chat_config, vision_model, llm). This exhausts CPU memory and the processes get killed. Is there a way around this?
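Not an official answer, but a common mitigation for this pattern (every torchrun rank materializing the full model in host RAM) is to load the checkpoint lazily, or to let DeepSpeed ZeRO-3 partition parameters at construction time. A minimal sketch of both options, assuming an HF-style checkpoint and an existing ZeRO-3 config file; the model path and file names are illustrative:

```python
# Two common workarounds for per-rank CPU OOM under torchrun; pick one.
import deepspeed
from transformers import AutoConfig, AutoModel

MODEL_PATH = "OpenGVLab/InternVL2-Llama3-76B"  # illustrative checkpoint path

# Option 1: lazy loading. low_cpu_mem_usage=True builds the model on the
# meta device and streams checkpoint shards in, instead of first
# allocating a full randomly initialized copy in every process.
model = AutoModel.from_pretrained(
    MODEL_PATH,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
)

# Option 2: ZeRO-3 partitioned construction (run under the torchrun or
# deepspeed launcher). Inside zero.Init(), each rank only allocates its
# own shard of the parameters rather than the whole model.
config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)
with deepspeed.zero.Init(config_dict_or_path="zero_stage3_config.json"):
    model = AutoModel.from_config(config, trust_remote_code=True)
```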