Psycoy / MixEval

The official evaluation suite and dynamic data release for MixEval.
https://mixeval.github.io/
224 stars 34 forks source link

How to use a OS/local Model for LLM as Judge Response Evaluation #45

Open carstendraschner opened 1 month ago

carstendraschner commented 1 month ago

HI @Psycoy

I need to have the option to evaluate the Benchmark with an Open Source Model as LLM-Judge. How Can I do that, if this is note possible shall we work on a PR? I have started a PR: https://github.com/Psycoy/MixEval/pull/46 Reason for this issue...We might face:

regards Carsten

Psycoy commented 1 week ago

Hi, I have merged the PR and now it should be able to use. Thanks for your great efforts!