TempleX98 / MoVA

MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Apache License 2.0

Importing and using Huggingface model with transformers #3

Open yoonseo111 opened 1 month ago

yoonseo111 commented 1 month ago

Hello,

First of all, I would like to express my gratitude for the wonderful research and resources provided by your team. They have been very helpful.

I am currently attempting to use this model from Hugging Face with the Transformers library, but I am encountering errors when trying to import and apply the model. Could you please provide some example code on how to correctly use these models? Any guidance or specific steps on how to successfully run a model would be greatly appreciated.

Thank you for your time and assistance.

Best regards,

YoonSeo KIM

TempleX98 commented 1 month ago

Hi, we have provided an evaluation document to help users run our model and reproduce the evaluation performance, following LLaVA. For example, the specific steps to run this model are presented in https://github.com/TempleX98/MoVA/blob/main/mova/eval/model_vqa_loader.py. BTW, could you show me the errors you are getting?
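For reference, the linked `model_vqa_loader.py` follows the usual LLaVA-style inference flow. Below is a minimal sketch of that flow; the module and function names (`mova.model.builder.load_pretrained_model`, `mova.mm_utils.tokenizer_image_token`, `get_model_name_from_path`) are assumptions carried over from the LLaVA code layout, so please check them against the repo before use:

```python
# Hedged sketch of LLaVA-style inference, modeled on mova/eval/model_vqa_loader.py.
# The mova.* imports below are assumed from the LLaVA repo layout, not a verified API.

def build_prompt(question: str, image_token: str = "<image>") -> str:
    """LLaVA-style prompt: the image placeholder token precedes the question text."""
    return f"{image_token}\n{question}"

def run_mova(model_path: str, image_path: str, question: str) -> str:
    # Imports are kept inside the function because they require the MoVA repo
    # on PYTHONPATH and a GPU environment.
    import torch
    from PIL import Image
    from mova.model.builder import load_pretrained_model          # assumed name
    from mova.mm_utils import (                                   # assumed names
        tokenizer_image_token,
        get_model_name_from_path,
    )

    # Load tokenizer, model, and the vision-side image processor.
    tokenizer, model, image_processor, _ = load_pretrained_model(
        model_path, None, get_model_name_from_path(model_path)
    )

    # Preprocess the image and tokenize the prompt (image token handled specially).
    image = Image.open(image_path).convert("RGB")
    image_tensor = image_processor.preprocess(image, return_tensors="pt")["pixel_values"]
    input_ids = tokenizer_image_token(
        build_prompt(question), tokenizer, return_tensors="pt"
    )

    with torch.inference_mode():
        output_ids = model.generate(
            input_ids.unsqueeze(0).cuda(),
            images=image_tensor.half().cuda(),
            max_new_tokens=128,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

This mirrors the structure of the evaluation script rather than a plain `transformers.AutoModel.from_pretrained` call, which is likely why a direct Transformers import fails: the model class is registered inside the repo's own package, not in the Transformers library.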