Version of ChatGPT 3.5 for evaluation.

Thank you for your excellent work!

I am confused about the evaluation process. In this file, ChatGPT-turbo-0327 appears to be the default setting, but this script specifies a different version.

Since the evaluation results differ significantly between versions (by about 10%), I would like to know which version was ultimately used. I tried to reproduce the results by running inference only, and my scores were similar to those in your report only with ChatGPT-turbo-0327, not with ChatGPT-turbo-0613.

Could you please clarify which version is used for the final evaluation?

Thank you!

LLaVA-VL / LLaVA-NeXT, issue #57