-
Change "Supported model formats" to "Both" (GGML and GGUF) for koboldcpp
-
This is a very popular method of interacting with models, with frequent updates and good performance. Can it be used with your program?
-
Hello! This app is super; I would not even call it an app, it feels alive.
A question about LLM settings (sorry, I am a noob): I am running this with command-r-plus, and it works amazingly.
Is…
-
Is it possible to use a custom link for endpoints like [**koboldcpp**](https://github.com/LostRuins/koboldcpp) or [**LM Studio**](https://lmstudio.ai/) so that a local LLM model can be utilized?
Will…
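A minimal sketch of the idea, assuming commonly cited defaults (koboldcpp serving on port 5001 and LM Studio on port 1234; verify both against your local setup): the client only needs a configurable base URL, which can be smoke-tested with the standard OpenAI-style /v1/models route.
```
# Connectivity check against a custom local endpoint.
# Base URLs are assumed defaults; substitute your own.
curl http://localhost:5001/v1/models    # koboldcpp
curl http://localhost:1234/v1/models    # LM Studio
```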
-
**Description**
For some time, llama.cpp has had an option to use a Q8 or Q4 KV cache. It is present, for example, in KoboldCPP and works great there.
Using a quantized KV cache reduces VRAM require…
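As a hedged sketch of how this is toggled today (flag names may differ between versions, so check --help on your build): koboldcpp exposes the feature as --quantkv, while upstream llama.cpp uses --cache-type-k and --cache-type-v.
```
# Assumed flag names; verify with --help on your build.
# koboldcpp: 0 = f16 (default), 1 = q8, 2 = q4
python ./koboldcpp.py --model model.gguf --quantkv 1

# Rough upstream llama.cpp equivalent
./llama-server -m model.gguf --cache-type-k q8_0 --cache-type-v q8_0
```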
-
```
➜ koboldcpp git:(concedo) python ./koboldcpp.py --help | rg UI
  --skiplauncher        Doesn't display or use the GUI launcher.
```
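For instance, a headless launch using that flag might look like this (the model path is a placeholder):
```
# Hypothetical headless invocation; model path is illustrative.
python ./koboldcpp.py --model ./models/model.gguf --skiplauncher
```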
-
I have KoboldCpp up and running from the launcher; I didn't configure any settings in the web GUI.
Then I open a command prompt and send a curl request:
```
curl -H "Content-type:application/jso…
```
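The command above is truncated; as a sketch, a complete request of that shape against koboldcpp's stock KoboldAI-style API might look like the following (the port 5001 and the /api/v1/generate route assume default settings, and the prompt and max_length values are illustrative):
```
# Assumes koboldcpp's default port and KoboldAI-compatible route;
# prompt and max_length are placeholder values.
curl http://localhost:5001/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Once upon a time", "max_length": 64}'
```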
-
```
D:\AI>koboldcpp.exe --threads 2 --blasthreads 2 --nommap --usecublas --gpulayers 50 --highpriority --blasbatchsize 512 --contextsize 8192
Welcome to KoboldCpp - Version 1.62.2
For command line arg…
```
-
XTTS seems to cut out early, before the response is finished.
I set chunks to --wav-chunk-sizes=100,200,300,400,9999; no go.
SillyTavern proper with Koboldcpp.exe and another model with extras enabled …
-
Is it possible to use llama.cpp's OpenAI compatibility?
https://github.com/ggerganov/llama.cpp/discussions/795
What changes would be required?
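For reference, a minimal sketch of such a request against a server exposing the OpenAI-style API (the port assumes llama.cpp's llama-server default of 8080, and the model name is a placeholder):
```
# Assumes llama-server's default port and its OpenAI-compatible
# /v1/chat/completions route; model and message are illustrative.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local", "messages": [{"role": "user", "content": "Say hi"}]}'
```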