booydar / babilong

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
Apache License 2.0
141 stars, 16 forks