Open Vital1162 opened 1 week ago
Hello, are you referring to using a vision model and a language model to build an MLLM?
I'm impressed by InternVL and would like a tutorial or documentation on how you combine these components (vision model + MLP + LLM), so that the approach is more accessible to newbies like me. Thanks!
📚 The doc issue
Is there a tutorial on integrating the vision model with the language model?
Suggest a potential alternative/fix
No response