allenai / OLMoE

OLMoE: Open Mixture-of-Experts Language Models
https://arxiv.org/abs/2409.02060
Apache License 2.0

llama.cpp / GGUF support #7

Open sammcj opened 2 months ago

sammcj commented 2 months ago

It would be great to see llama.cpp/GGUF support for OLMoE (OlmoeForCausalLM).

Really neat project!

AmitKKhanchandani commented 2 months ago

+1, need to try this with ollama :)

Bobetele commented 2 months ago

yes

Muennighoff commented 2 months ago

won't have bandwidth to do this, but if anyone is interested, that'd be amazing!

MrDowntempo commented 2 months ago

Yeah, this is hard to work with if it isn't in GGUF format to run locally, or available from Ollama directly. I'm looking into how to serve from Safetensors, but not many servers support that.

Muennighoff commented 2 months ago

also cc @2015aroras

2015aroras commented 2 months ago

See https://github.com/ggerganov/llama.cpp/pull/9462

Meshwa428 commented 2 months ago

See https://github.com/ggerganov/llama.cpp/pull/9462

It still isn't merged 😞