-
With the new install of KoboldCpp, I find the models struggle with narration. I am using the same models as before, no change there, with the latest release of KoboldCpp. When I type something like "continue" norm…
-
**Describe the Issue**
Flash attention is not activated even when selected in the 1.70 UI.
**Additional Information:**
![no flash attention](https://github.com/user-attachments/assets/d61c7725-bde7-49d0-…
-
My AMD 6900XT's VRAM would idle down to 0 MHz after a while, and this would tank koboldcpp's performance. The idling VRAM doesn't affect other workloads like gaming, because the VRAM frequency would pop ba…
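A possible workaround sketch for this kind of idle-clock throttling, assuming a Linux host with ROCm installed: pin the GPU performance level with `rocm-smi` while koboldcpp runs, then restore automatic power management. The `--setperflevel` flag is a real `rocm-smi` option, but whether it resolves this specific 6900XT behavior is an assumption.

```shell
# Hypothetical workaround (assumes Linux + ROCm; effectiveness on a
# 6900XT with this exact issue is not confirmed).
PERF_CMD="rocm-smi --setperflevel high"     # pin GPU/VRAM clocks high
RESTORE_CMD="rocm-smi --setperflevel auto"  # restore auto power management

if command -v rocm-smi >/dev/null 2>&1; then
  $PERF_CMD
  # ... launch koboldcpp here while clocks stay pinned ...
  $RESTORE_CMD
else
  echo "rocm-smi not found; skipping clock pinning"
fi
```

On Windows, an equivalent effect is sometimes achieved with vendor tuning tools, but that path is untested here.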
-
```
===> Building for koboldcpp-1.57.1
[ 1% 4/64] cd /usr/ports/misc/koboldcpp/work/koboldcpp-1.57.1 && /usr/local/bin/cmake -DMSVC= -DCMAKE_C_COMPILER_VERSION=16.0.6 -DCMAKE_C_COMPILER_ID=Clang -…
```
-
XTTS seems to cut out early, before the response is finished.
I set the chunks to --wav-chunk-sizes=100,200,300,400,9999
No go.
SillyTavern proper with koboldcpp.exe and another model with extras enabled …
-
Hello LostRuins,
I'm impressed with your work on KoboldCpp and I'm interested in potentially sponsoring the project's development.
Could you please provide me with a way to discuss this further?…
-
Hi, can you please include perplexity evaluation from llama.cpp in koboldcpp? There is a separate tool for that, called perplexity, in llama.cpp. Currently it looks like this tool is not present comple…
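For reference, the upstream tool can be run directly against a llama.cpp build. A minimal sketch, assuming the `perplexity` binary has been built in the llama.cpp tree; the model and dataset paths are placeholders, while the `-m` and `-f` flags are the tool's real options.

```shell
# Sketch: evaluate perplexity with llama.cpp's standalone tool.
MODEL="models/7B/ggml-model-q4_0.gguf"   # placeholder model path
DATASET="wikitext-2-raw/wiki.test.raw"   # common evaluation corpus

if [ -x ./perplexity ]; then
  ./perplexity -m "$MODEL" -f "$DATASET"
else
  echo "perplexity binary not built; build it in the llama.cpp tree first"
fi
```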
-
For reviewing confidential documents, we'd prefer to use our own APIs, either via oobabooga or koboldcpp. Is there the possibility of including options for using alternate APIs, including local or Cl…
-
```
D:\AI>koboldcpp.exe --threads 2 --blasthreads 2 --nommap --usecublas --gpulayers 50 --highpriority --blasbatchsize 512 --contextsize 8192
Welcome to KoboldCpp - Version 1.62.2
For command line arg…
```
-
- [ ] I have searched the existing issues
### Current behavior
I see a bunch of stuff on HuggingFace and llama.cpp Git about pre-tokenizers causing issues upon initial release of the qu…