LLaVA-VL / LLaVA-NeXT


Version of ChatGPT 3.5 for evaluation. #57

Open Hon-Wong opened 2 weeks ago

Hon-Wong commented 2 weeks ago

Thank you for your excellent work!

I am confused about the evaluation process. In this file, ChatGPT-turbo-0327 appears to be the default setting. However, this script specifies a different version.

Since the evaluation results vary significantly between versions (by about 10%), I would like to know which version was ultimately used. When I tried to reproduce the results by running inference only, my scores were close to your report only with ChatGPT-turbo-0327, not with ChatGPT-turbo-0613.
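A minimal sketch of how a gap like this can be quantified, assuming per-sample judge scores have already been collected under each version. The function name `relative_gap` and the sample numbers are hypothetical placeholders, not results from this repository or from the runs described above.

```python
from statistics import mean


def relative_gap(scores_a, scores_b):
    """Relative difference of mean(scores_b) with respect to mean(scores_a)."""
    if not scores_a or not scores_b:
        raise ValueError("both score lists must be non-empty")
    base = mean(scores_a)
    if base == 0:
        raise ValueError("baseline mean is zero; relative gap is undefined")
    return (mean(scores_b) - base) / base


# Placeholder scores only: the same model outputs rated by two judge versions.
scores_judge_a = [3.0, 4.0, 5.0]
scores_judge_b = [2.7, 3.6, 4.5]

gap = relative_gap(scores_judge_a, scores_judge_b)
print(f"relative gap between judge versions: {gap:+.1%}")
```

Scoring the same set of model outputs with both judge versions keeps the model held fixed, so any gap comes from the judge alone.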

Could you please clarify which version is used for the final evaluation?

Thank you!

ZhangYuanhan-AI commented 2 weeks ago

ChatGPT-turbo-0613