pnewhook opened this issue 1 year ago
Hello, we currently don't support external APIs, only generation with `transformers`. Feel free to open a PR if you have something in mind.
If you're interested in HumanEvalPack benchmarks and OpenAI models, there's a task that supports it (docs here): https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/lm_eval/tasks/humanevalpack_openai.py
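For illustration only, here is a minimal sketch of the API-generation pattern such a task relies on, using the `openai` Python client. This is not the contents of that file; the model name, prompt, and sampling parameters are placeholders.

```python
# Minimal sketch: generating completions through the OpenAI API instead of a
# local transformers checkpoint. Model name and parameters are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate(prompt: str, n_samples: int = 1) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        n=n_samples,
        temperature=0.8,
        max_tokens=512,
    )
    return [choice.message.content for choice in response.choices]
```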
Is there a way I could 'fake' a local model and have it call a hosted API endpoint? @pnewhook @loubnabnl
I don't think that's possible with the current setup, which uses `transformers` loading and assumes you have the model checkpoint locally.
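To make the idea concrete, a hypothetical sketch of what "faking" a local model could look like: a wrapper that exposes a `generate()`-style interface but calls a hosted REST endpoint. The endpoint URL, payload fields, and response schema below are assumptions, and the harness currently has no hook to plug such a wrapper in.

```python
# Hypothetical sketch: wrap a hosted REST endpoint behind a generate()-style
# interface so code that expects model.generate(prompt) can call the API.
# The URL, payload fields, and response schema are assumptions and would need
# to match the actual service.
import requests


class RemoteAPIModel:
    def __init__(self, endpoint: str, api_key: str):
        self.endpoint = endpoint
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def generate(self, prompt: str, max_new_tokens: int = 512) -> str:
        payload = {"prompt": prompt, "max_new_tokens": max_new_tokens}
        resp = requests.post(
            self.endpoint, json=payload, headers=self.headers, timeout=120
        )
        resp.raise_for_status()
        return resp.json()["generated_text"]  # response field is an assumption


model = RemoteAPIModel("https://example.com/v1/generate", api_key="...")
print(model.generate("def fibonacci(n):"))
```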
I have a model that's only available through a RESTful API, and I need to get some benchmarks. I'd like to run MultiPL-E benchmarks in a few languages. Has any work gone into using bigcode-evaluation-harness to perform generation with an API instead of on the local machine?
lm-evaluation-harness has the ability to run against commercial APIs, notably OpenAI's.
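As a rough sketch of that route, assuming a recent lm-evaluation-harness release with a Python entry point; the backend, model, and task names are placeholders and should be checked against the installed version's docs.

```python
# Hedged sketch: evaluating an API-backed OpenAI model with lm-evaluation-harness
# instead of a local checkpoint. Names below are placeholders to verify against
# the installed lm-eval version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="openai-completions",                 # API-backed model backend
    model_args="model=gpt-3.5-turbo-instruct",  # placeholder model name
    tasks=["humaneval"],                        # placeholder task name
)
print(results["results"])
```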