Closed: sherzod-hakimov closed this issue 5 months ago.
Clarify the title? In what sense do we want to switch? Haven't we committed to running local models directly (i.e., the backend is the transformers object)?
As far as I'm aware, there is a separate issue about providing a locally running, "free" API for exploratory work. But the idea is not (necessarily) for the clembench runs to use this service, e.g., to run Llamas. Or is it?
Looks like we've decided on FastChat? Close?
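For reference, FastChat exposes an OpenAI-compatible REST API (served via `python -m fastchat.serve.openai_api_server`), so a backend could talk to it over plain HTTP. A minimal sketch, assuming the server runs on the default port 8000 and that a model like `vicuna-7b-v1.5` is loaded (both the URL and model name here are assumptions, not settled choices):

```python
import json
from urllib import request

# Assumed local endpoint of FastChat's OpenAI-compatible server
# (default port 8000; adjust if the deployment differs).
API_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(model: str, user_message: str, temperature: float = 0.0) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }


def query(model: str, user_message: str) -> str:
    """POST the payload to the local server and return the reply text."""
    data = json.dumps(build_payload(model, user_message)).encode("utf-8")
    req = request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires a running FastChat server with the model loaded.
    print(query("vicuna-7b-v1.5", "Say hello."))
```

Because the request/response shape follows the OpenAI chat format, the same client code would also work against other OpenAI-compatible local servers, which keeps the backend decision reversible.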
Potential libraries: