MMMU-Benchmark / MMMU

This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
https://mmmu-benchmark.github.io/
Apache License 2.0

model evaluation #3

Closed mactavish91 closed 9 months ago

mactavish91 commented 9 months ago

Thank you for this great benchmark. We have recently adopted a training strategy similar to LLaVA's, which co-trains VQA and chat data, resulting in significant improvements. Could you re-evaluate our model? https://github.com/THUDM/CogVLM/
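
For illustration, here is a minimal sketch of the co-training idea described above (a hypothetical illustration, not CogVLM's or LLaVA's actual pipeline; `vqa_data` and `chat_data` are placeholder lists of training examples):

```python
import random

def mixed_batches(vqa_data, chat_data, batch_size=8, vqa_ratio=0.5):
    """Yield batches that interleave VQA and chat examples at a fixed ratio."""
    n_vqa = int(batch_size * vqa_ratio)
    while True:
        batch = random.sample(vqa_data, n_vqa)                  # VQA portion
        batch += random.sample(chat_data, batch_size - n_vqa)   # chat portion
        random.shuffle(batch)  # mix the two sources within each batch
        yield batch
```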

xiangyue9607 commented 9 months ago

Thanks! We are happy to re-evaluate your model. That said, it would be great if you could run the evaluation yourself and provide your validation and test set predictions; then we can update the leaderboard with a fair and accurate score.

Thanks, Xiang
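
For anyone preparing such a submission, a minimal sketch of producing a predictions file might look like the following. This is a sketch under assumptions, not the repo's official script: it assumes the dataset is hosted on Hugging Face as MMMU/MMMU with per-subject configs and an `id` field per example, and `answer_with_your_model` is a hypothetical wrapper around whatever model is being scored.

```python
import json
from datasets import load_dataset

def answer_with_your_model(example):
    # Hypothetical placeholder: run your model (e.g. CogVLM) on the
    # question, options, and image(s), returning an option letter like "A".
    raise NotImplementedError

predictions = {}
for subject in ["Accounting", "Art"]:  # extend to all MMMU subjects
    rows = load_dataset("MMMU/MMMU", subject, split="validation")
    for example in rows:
        predictions[example["id"]] = answer_with_your_model(example)

# A single JSON object mapping question id -> predicted answer.
with open("validation_predictions.json", "w") as f:
    json.dump(predictions, f, indent=2)
```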


xiangyue9607 commented 9 months ago

You can submit your test set predictions on EvalAI: https://eval.ai/web/challenges/challenge-page/2179. We will update your results in our paper based on your submission.
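
Before uploading, a quick sanity check like the one below can catch a malformed file. The flat id-to-answer JSON shape is an assumption here, not official EvalAI documentation; check the challenge page for the exact expected format.

```python
import json

# Load the predictions file intended for submission.
with open("test_predictions.json") as f:
    predictions = json.load(f)

# Assumed shape: a flat JSON object {question_id: answer_string}.
assert isinstance(predictions, dict), "expected a {question_id: answer} object"
for qid, answer in predictions.items():
    assert isinstance(qid, str) and isinstance(answer, str), (qid, answer)
print(f"{len(predictions)} predictions look well-formed")
```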