Open atazangene opened 3 months ago
I just saw this model and I think it's really amazing; it's great work. To improve this model, I have some suggestions:
Compare the model with the latest open/closed models
I think it would be better to compare it with the latest models, such as LLaVA 1.6 and Qwen-Max, which are currently the highest-end models available.
Release the 34B model to beat the competitors
If you have sufficient funds and resources, I think it would be beneficial to release the 34B model.
Create a video tutorial on how to fine-tune this model
One of the biggest problems with other models is that they provide only limited text documentation on how to fine-tune them. I believe that a video tutorial would make a greater impact online and draw a lot of attention to this model.
Thanks for your interest in our work, and thanks for your suggestions! MoAI-7B was compared with LLaVA 1.6-13B and -34B in Figure 6 of our paper. We will compare MoAI with Qwen-Max and Gemini-Ultra, but our work mainly aims to give LLVMs real-world scene understanding, so it is questionable whether comparing MoAI-7B with much larger LLVMs is really a fair comparison. Unfortunately, we do not have abundant GPU resources or funding, so releasing a 34B model is not possible at this time. Creating a video to explain the training steps in a simple manner sounds like a great idea. Thanks again!