nvtransfer / RULER

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
Apache License 2.0

Does 128K sequence length mean 131072 or 128000? #34

Closed syp1997 closed 3 months ago

hsiehjackson commented 3 months ago

Normally it means 131072 tokens (128 × 1024), but some models, such as GPT-4, use 128000 (128 × 1000).
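For reference, the two readings of "128K" differ only in whether K is taken as 1024 or 1000. A minimal sketch of the arithmetic follows; the helper name `context_length` and its `binary` flag are hypothetical and not part of RULER:

```python
def context_length(k: int, binary: bool = True) -> int:
    """Convert a 'K' context size into a token count.

    Hypothetical helper for illustration only (not a RULER function).
    binary=True treats K as 1024; binary=False treats K as 1000.
    """
    return k * (1024 if binary else 1000)


print(context_length(128))                # 131072
print(context_length(128, binary=False))  # 128000
```

The gap is 3072 tokens, so it is worth checking which convention a given model uses before setting the sequence length.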