nvtransfer / RULER

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
Apache License 2.0

gpt-4o results? #12

Open the21st opened 5 months ago

the21st commented 5 months ago

Would love to see results for gpt-4o. There was some claimed improvement in its abilities: http://nian.llmonpy.ai/

hsiehjackson commented 5 months ago

We also plan to run the evaluation for gpt-4o! It looks like gpt-4o shows a large improvement on the lost-in-the-middle issue.

mirek190 commented 3 months ago

So, when?

impredicative commented 1 month ago

Looking forward to it for the latest versions of 4o and 4o-mini!