turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Evaluate with lm-evaluation-harness #326

Closed OfficialDelta closed 8 months ago

OfficialDelta commented 8 months ago

Is it possible to evaluate EXL2 models with https://github.com/EleutherAI/lm-evaluation-harness?

I'm trying to do this, but I haven't been able to figure out how. One workaround I found is using TabbyAPI to host the model and connecting to it as if it were a custom endpoint, but that doesn't allow for features like multiple-choice question answering.

Another way to fix this would be if it were possible to load EXL2 models with Transformers, but I haven't found a way to do that yet.

turboderp commented 8 months ago

The easiest way would probably be to use TabbyAPI to host a local OAI-compatible server and then run lm-evaluation-harness with `--model local-chat-completions`.
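As a sketch, the invocation might look roughly like the following. The port, `base_url`, model name, and task are assumptions here, not values from this thread; they depend on your TabbyAPI config and which benchmark you want to run.

```shell
# Start TabbyAPI separately, then point lm-evaluation-harness at its
# OpenAI-compatible chat endpoint. The base_url, model name, and task
# below are placeholders -- adjust them to match your setup.
lm_eval --model local-chat-completions \
  --model_args model=my-exl2-model,base_url=http://localhost:5000/v1/chat/completions \
  --tasks gsm8k \
  --apply_chat_template
```

Note that chat-completions endpoints generally don't expose per-token logprobs, so loglikelihood-based multiple-choice tasks won't work this way; only generation-based tasks will.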

OfficialDelta commented 8 months ago

Okay, that's what I ended up doing, thank you.