evaluation of popular models on BABILong

booydar / babilong

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Apache License 2.0

Closed by yurakuratov 3 months ago

yurakuratov commented 3 months ago