InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes
Apache License 2.0

Support llama.cpp #94

Closed · kerthcet closed this 3 weeks ago

kerthcet commented 3 weeks ago

What this PR does / why we need it

Now we can serve GGUF models with llama.cpp; what's more, we can run the e2e tests on CPU.
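For context, serving a GGUF model through the new backend would look roughly like the sketch below. This is a minimal, hypothetical example only: the field names (`backendRuntimeConfig`, `backendName: llamacpp`, `source.modelHub.filename`) and the model/file identifiers are assumptions based on llmaz's OpenModel/Playground API, not the confirmed schema from this PR.

```yaml
# Hypothetical sketch: serve a GGUF model with the llama.cpp backend.
# Field names and model identifiers are assumptions, not the confirmed llmaz schema.
apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: qwen2-0.5b-gguf
spec:
  familyName: qwen2
  source:
    modelHub:
      modelID: Qwen/Qwen2-0.5B-Instruct-GGUF
      filename: qwen2-0_5b-instruct-q5_k_m.gguf   # single GGUF file to pull
---
apiVersion: inference.llmaz.io/v1alpha1
kind: Playground
metadata:
  name: qwen2-0.5b-gguf
spec:
  replicas: 1
  modelClaim:
    modelName: qwen2-0.5b-gguf
  backendRuntimeConfig:
    backendName: llamacpp   # select llama.cpp instead of the default backend
```

Because llama.cpp runs well on CPU-only nodes, a small GGUF model like this is also what makes CPU-based e2e testing practical.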

Which issue(s) this PR fixes

Fixes https://github.com/InftyAI/llmaz/issues/65

Special notes for your reviewer

Does this PR introduce a user-facing change?

Support llama.cpp as another backend

kerthcet commented 3 weeks ago

/kind feature

kerthcet commented 3 weeks ago

/lgtm
/approve