Open the21st opened 1 month ago
Would love to see results for gpt-4o. There was some claimed improvement in its abilities: http://nian.llmonpy.ai/
We also plan to run the evaluation for gpt-4o! It looks like gpt-4o shows a large improvement on the lost-in-the-middle issue.