LiveCodeBench / LiveCodeBench

Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"
https://livecodebench.github.io/
MIT License
131 stars 14 forks source link

HuggingFace Hub #23

Open WesleyTheGeolien opened 1 month ago

WesleyTheGeolien commented 1 month ago

Hello,

Thank you for your hard work.

I tried to run the code bench locally (on a RTX 3060 12Gb) but was hitting issues, I know however though that it is possible to use Hugging face hub inference, would this be of interest to setup a Hugging face hub runner? I may have some scope to contribute if that could be of use.

Naman-ntc commented 1 month ago

Hi, that sounds like a nice idea. Similarly, we might also want to support together or fireworks like API providers.

WesleyTheGeolien commented 1 month ago

Yeah sounds like a good idea!

So thinking from a software point of view maybe an abstract API Runner class that maybe sets up the runner and client then implementations per api provider, I think the openai runner could move to this sort of framework potentially as well?

From a quick test I had to:

Naman-ntc commented 1 month ago

Thanks, yes this sounds reasonable. Adding an additional flag to choose between api provider or vllm sounds reasonable.

kartikzheng commented 3 weeks ago

It would be great to have api support, thanks for your contributions.

rodion-m commented 3 weeks ago

Also, OpenRouter support is very appreciated. With OR we can choose almost any model.

kartikzheng commented 2 weeks ago

Hello, I have added support for the OpenAI-style interface, through the http interface from the inference framework such as vllm, you can directly call to obtain the generated content. I can contribute the code if needed.

rodion-m commented 2 weeks ago

Hello, I have added support for the OpenAI-style interface, through the http interface from the inference framework such as vllm, you can directly call to obtain the generated content. I can contribute the code if needed.

Wow! It's very cool! PR please. @Naman-ntc please approve.