OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model that approaches GPT-4V's performance.
https://internvl.github.io/
MIT License

Any further plans on knowledge distillation? #15

Closed · lcxrocks closed this issue 6 months ago

lcxrocks commented 6 months ago

ViT-22B conducted knowledge distillation experiments (see Table 8), demonstrating that it is not only a large-scale model but also an excellent teacher. Given that InternVL is the largest open-source vision/vision-language foundation model to date (and a good alternative to ViT-22B), has there been any consideration of, or have any experiments been run on, distilling it into smaller models? A rough sketch of the kind of feature-level distillation I have in mind is below. Thank you in advance for your attention.
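A minimal sketch, assuming a generic frozen teacher and a trainable student with a linear projection to align feature dimensions; the encoders, hidden sizes, and projection head here are placeholders for illustration, not InternVL's actual training code:

```python
# Feature-level knowledge distillation sketch (placeholder models).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistillationWrapper(nn.Module):
    def __init__(self, teacher: nn.Module, student: nn.Module,
                 teacher_dim: int, student_dim: int):
        super().__init__()
        self.teacher = teacher.eval()  # teacher is frozen during distillation
        for p in self.teacher.parameters():
            p.requires_grad_(False)
        self.student = student
        # Linear head projecting student features into the teacher's space.
        self.proj = nn.Linear(student_dim, teacher_dim)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            t_feat = self.teacher(images)          # (B, teacher_dim)
        s_feat = self.proj(self.student(images))   # (B, teacher_dim)
        # Cosine-similarity distillation loss (1 - cos), averaged over batch.
        return (1.0 - F.cosine_similarity(s_feat, t_feat, dim=-1)).mean()

# Toy usage with stand-in encoders; in practice these would be the large
# vision encoder (teacher) and a much smaller ViT (student).
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1024))
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 256))
kd = DistillationWrapper(teacher, student, teacher_dim=1024, student_dim=256)
loss = kd(torch.randn(4, 3, 224, 224))
loss.backward()  # gradients flow only into the student and projection head
```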

czczup commented 6 months ago

Hello, thank you for your question. We have considered distilling InternVL into smaller models. In the future, we plan to conduct relevant distillation experiments to explore its potential as a teacher for smaller models.

lcxrocks commented 6 months ago

Sounds promising. I hope to see this vision-language model applied in many more settings. Thank you for your quick response.