Closed Baiqi-Li closed 2 weeks ago
Hey, I have fixed the code style to meet the pre-commit requirements. Please review and check the code. Thank you!
Hi, we evaluated InternVL2-8B on NaturalBench, and the results are slightly lower than those reported in the paper, as shown below. Could you please let us know whether this discrepancy is significant?
@PhoenixZ810 @linzhiqiu The score is only about 1.% lower than what's reported in the paper, which I think is reasonable. We ensured that the dataset we uploaded is exactly the same as the one used in the paper. Possible reasons for the discrepancy include:
We will run further tests across multiple open-source platforms and consider updating the results in the paper and on the leaderboard. Thank you very much!
Add NaturalBench (NeurIPS24): paper dataset