Evaluation on llava-bench-in-the-wild for the MoE-LLaVA experiments

LINs-lab / DynMoE

[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

https://arxiv.org/abs/2405.14297

Apache License 2.0


Evaluation on llava-bench-in-the-wild for the MoE-LLaVA experiments #2

Closed: sharkdrop closed this issue 1 month ago

sharkdrop commented 1 month ago

Hello authors, regarding the llava-bench-in-the-wild evaluation: is the gpt_model you used gpt-3.5-turbo, or a different model?

QAQdev commented 1 month ago

Hello, and thank you for your interest. We use gpt-3.5-turbo on all benchmarks that require GPT-based evaluation.

sharkdrop commented 1 month ago

Thank you for the prompt reply. In that case, for the evaluation result below, which value should be reported:

QAQdev commented 1 month ago

@haotian-liu I have the same question, can you share which result to present? @OliverLeeXZ and @g-h-chen were you able to find the correct strategy? @HenryHZY Could you expand on your response?

I think the result is the upper left value, that is, the first value.

Under this setting, the result in model-zoo.md matches the average of the three attached results in eval.zip.

You can refer to this comment: https://github.com/haotian-liu/LLaVA/issues/958#issuecomment-1989556206
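As a rough illustration of the strategy quoted above (take the first value from each run, then average across runs), here is a minimal sketch. It assumes each run's summary is a single line made of a label followed by numbers, with the first number being the value to report. That layout is an assumption, since the evaluation output from this thread is not reproduced here. The helper names and the sample lines are hypothetical placeholders, not real results.

```python
from statistics import mean


def first_value(summary_line: str) -> float:
    """Return the first numeric value on a line such as 'all 60.0 80.0 48.0'.

    Assumed layout: a label followed by whitespace-separated numbers.
    """
    _label, *values = summary_line.split()
    return float(values[0])


def average_first_value(runs: list[str]) -> float:
    """Average the first value over several evaluation runs, rounded to one decimal."""
    return round(mean(first_value(line) for line in runs), 1)


# Hypothetical placeholder lines, NOT real evaluation results.
example_runs = [
    "all 60.0 80.0 48.0",
    "all 62.0 80.0 49.6",
    "all 61.0 80.0 48.8",
]

reported = average_first_value(example_runs)
print(reported)
```

Averaging over several runs smooths out the run-to-run variation of a GPT-based judge, which is presumably why the quoted comment compares the model-zoo.md number against the mean of three runs.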

sharkdrop commented 1 month ago

感谢！