LostRuins / koboldcpp

A simple one-file way to run various GGML and GGUF models with a KoboldAI UI
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Why does koboldcpp work better than llamacpp? #372

Open LeLaboDuGame opened 11 months ago

LeLaboDuGame commented 11 months ago

Hello guys! I don't know if this is the right place to ask, but I'd like to know a few things.

First, my setup: Windows 11, i5-10400F, 16 GB RAM, RX 6600 XT, a 7B HF LLaMA model, running with CLBlast.

I have tried several ways to run LLaMA models: llama.cpp, koboldcpp, the Python API, and so on.

My result is that koboldcpp with a 7B model and CLBlast works better than the other options: I get both better performance and better output.

AND I WANT TO KNOW WHY AND HOW!

To explain: I'm asking because I want to build a personal assistant that uses AI. For that I need an API, so the llama Python API seemed like a good way. But I don't get the same amazing results that koboldcpp gives me.

Does koboldcpp have an API? In Python? Or how does koboldcpp get better results? Is it because of the prompt input or the hyperparameters?

Thanks in advance! All responses are appreciated!

PS: English is not my first language (I'm young lol)

LostRuins commented 11 months ago

Yes, KoboldCpp does have an API. You can refer to the wiki https://github.com/LostRuins/koboldcpp/wiki for more info, as well as the API reference linked there. As for performance, it should actually be quite close to upstream, but perhaps with better default configs?
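Since the question was about driving KoboldCpp from Python: once the server is running, its HTTP API can be called with nothing but the standard library. Below is a minimal sketch; the port, sampler values, and response shape are assumptions based on KoboldCpp's `/api/v1/generate` endpoint, so check the API reference linked from the wiki for the authoritative details.

```python
import json
import urllib.request

# Assumes a local KoboldCpp instance on its default port (5001);
# adjust the host/port to match your own setup.
API_URL = "http://localhost:5001/api/v1/generate"

def build_payload(prompt, max_length=80, temperature=0.7):
    """Assemble the JSON body for the generate endpoint.

    The sampler settings here (temperature, top_p, rep_pen) are
    illustrative values, not KoboldCpp's shipped defaults.
    """
    return {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "top_p": 0.9,
        "rep_pen": 1.1,
    }

def generate(prompt, **kwargs):
    """POST the prompt and return the generated continuation text."""
    data = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

With a running instance, `generate("User: Hello!\nAssistant:")` should return the model's continuation. Note that differences in output quality between frontends usually come down to exactly the things asked about above: the prompt formatting and the sampler settings sent with each request, both of which you control in this payload.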