Psycoy / MixEval

The official evaluation suite and dynamic data release for MixEval.
https://mixeval.github.io/

Can GGUF and EXL2 compatibility be added? #1

Closed RodriMora closed 2 months ago

RodriMora commented 3 months ago

Hi!

I've been testing MixEval, but it requires the full-precision models downloaded from HF. Due to hardware limitations, quantized models are really popular in the self-hosted community, especially in the GGUF format (using llama.cpp; high compatibility and decent speed with layers offloaded to VRAM) and EXL2 (using exllamav2; great speed, but requires the model to be fully loaded in VRAM).

Psycoy commented 3 months ago

Thank you for raising this issue! We will look into that and add support!

RodriMora commented 3 months ago

A good alternative would be to allow changing the openai_base_url so we can connect to the OpenAI-compatible APIs that most software supports, like llama.cpp server, vLLM...

Psycoy commented 2 months ago

okay, will look into it

maziyarpanahi commented 2 months ago

> A good alternative would be to allow changing the openai_base_url so we can connect to the OpenAI-compatible APIs that most software supports, like llama.cpp server, vLLM...

This would be amazing! It should be pretty straightforward and would open up tons of local LLM applications!

Psycoy commented 2 months ago

I think the current code provides flexible custom model registration. See the Registering New Models section for details.

E.g., to register Yi-large, simply copy & paste the GPT-4o model script, change the related names, and change

```python
self.client = OpenAI(
    api_key=YOUR_KEY,
    timeout=Timeout(timeout=100.0, connect=20.0)
)
```

to

```python
self.client = OpenAI(
    api_key=YOUR_KEY,
    timeout=Timeout(timeout=100.0, connect=20.0),
    base_url="https://api.lingyiwanwu.com/v1"
)
```

I.e., simply adding the "base_url" field to the OpenAI instance will do if you wish to use the "openai_base_url" feature. You may also want to modify the decoding function to catch some API-calling errors (see mix_eval/models/yi_large.py).
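On the error-catching point: a minimal retry sketch for a flaky decoding call. This is not MixEval's actual implementation; the function names are placeholders, and the broad `except Exception` stands in for whatever API errors you want to catch.

```python
import time

def decode_with_retries(call, max_retries=3, backoff=1.0):
    """Retry a flaky API call with simple exponential backoff.

    `call` is any zero-argument function that performs the request;
    the broad `except Exception` mirrors catching assorted API errors.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            time.sleep(backoff * (2 ** attempt))

# Simulate a call that fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient API error")
    return "ok"

result = decode_with_retries(flaky, backoff=0.0)
```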

maziyarpanahi commented 2 months ago

> I think the current code provides flexible custom model registration. See the Registering New Models section for details.
>
> E.g., to register Yi-large, simply copy & paste the GPT-4o model script, change the related names, and change
>
> ```python
> self.client = OpenAI(
>     api_key=YOUR_KEY,
>     timeout=Timeout(timeout=100.0, connect=20.0)
> )
> ```
>
> to
>
> ```python
> self.client = OpenAI(
>     api_key=YOUR_KEY,
>     timeout=Timeout(timeout=100.0, connect=20.0),
>     base_url="https://api.lingyiwanwu.com/v1"
> )
> ```
>
> I.e., simply adding the "base_url" field to the OpenAI instance will do if you wish to use the "openai_base_url" feature. You may also want to modify the decoding function to catch some API-calling errors (see mix_eval/models/yi_large.py).

Thanks, but I think there is a misunderstanding here. I meant adding base_url to the argument parser for the judging step at the end, so local LLMs can be used as judges instead of OpenAI.

Psycoy commented 2 months ago

Hey @maziyarpanahi and @RodriMora,

I have added the feature. Specify --api_base_url if you wish to use another API, such as a llama.cpp server or the Azure OpenAI API.
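For reference, the wiring behind such a flag can be sketched as below. Only the flag name --api_base_url comes from this thread; the parser setup, the example URL (llama.cpp server's default port), and the kwargs plumbing are assumptions for illustration.

```python
import argparse

# Sketch: threading an --api_base_url flag through to an OpenAI-style client.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--api_base_url",
    type=str,
    default=None,
    help="OpenAI-compatible endpoint, e.g. a local llama.cpp server.",
)
args = parser.parse_args(["--api_base_url", "http://localhost:8080/v1"])

# Pass base_url through only when set, so the default OpenAI endpoint
# still applies when the flag is omitted.
client_kwargs = {"api_key": "YOUR_KEY"}
if args.api_base_url:
    client_kwargs["base_url"] = args.api_base_url
```

With this pattern, `OpenAI(**client_kwargs)` would hit the local server when the flag is given and the default OpenAI endpoint otherwise.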

maziyarpanahi commented 2 months ago

Many thanks @Psycoy