UKGovernmentBEIS / inspect_ai

Inspect: A framework for large language model evaluations
https://inspect.ai-safety-institute.org.uk/
MIT License

No support for Ollama/offline models? #13

Closed jimccadm closed 4 months ago

jimccadm commented 4 months ago

It would be useful if you could incorporate testing of offline LLMs. This community of models is growing rapidly, and with some smaller models performing poorly and others doing really well, this toolset could help set standards and highlight risks (and benefits in stronger model responses) for users.

Main site: www.ollama.com

Great to see this btw, awesome work!!!

aisi-inspect commented 4 months ago

Thanks! Currently support for offline models exists but only via HF transformers (https://ukgovernmentbeis.github.io/inspect_ai/models.html#sec-hugging-face-transformers). We'd definitely like to see more offline providers though!
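For reference, running an eval against a local Hugging Face model looks roughly like this (a sketch: the `hf/` provider prefix is from the linked docs, while `example.py` and the `openai-community/gpt2` repo id are just illustrative placeholders):

```shell
# Install Inspect plus the dependencies the HF provider needs
pip install inspect-ai torch transformers

# Run an eval task against a locally downloaded Hugging Face model;
# the part after "hf/" is a Hugging Face Hub repo id
inspect eval example.py --model hf/openai-community/gpt2
```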

We need to make one improvement first, and then it will be possible to implement support for other providers in external packages. So, for example, the ollama-python package could include an Inspect model provider, which would make e.g. model="ollama/llama3" "just work" when it's installed alongside Inspect.

Will follow up on this thread once we have the plumbing in place to make this work.

aisi-inspect commented 4 months ago

Discovered that Ollama actually supports the OpenAI API directly, so it was quite easy to add to Inspect. This is now available in the v0.3.9 release. Assuming you have Ollama running locally and have pulled the llama3 model, you can use it as follows:

inspect eval example.py --model ollama/llama3
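End to end, the local workflow might look like this (a sketch assuming Ollama's defaults, including its standard local server on port 11434; `example.py` is any Inspect task file):

```shell
# Fetch the llama3 model weights locally
ollama pull llama3

# Start the Ollama server if it isn't already running
ollama serve &

# Point Inspect at the local model via the new ollama provider
inspect eval example.py --model ollama/llama3
```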

More details in the full docs on models: https://ukgovernmentbeis.github.io/inspect_ai/models.html

jimccadm commented 4 months ago

Fantastic work aisi-inspect team!

mchiang0610 commented 4 months ago

Thank you @jimccadm for the issue, and thank you @aisi-inspect for being so fast