-
# Expected Behavior
When editing the last prompt, only the part starting from the first edited word should be processed.
# Current Behavior
This currently works without context shifting. Howe…
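The expected behavior above amounts to reusing the cached state up to the longest common token prefix of the old and edited prompts, and re-evaluating only from the first divergent token. A minimal sketch of that comparison (a hypothetical helper, not KoboldCpp's actual implementation):

```python
def first_divergence(old_tokens, new_tokens):
    """Return the index of the first token where the edited prompt differs
    from the cached one; everything before that index can be reused."""
    n = 0
    limit = min(len(old_tokens), len(new_tokens))
    while n < limit and old_tokens[n] == new_tokens[n]:
        n += 1
    return n
```

Processing would then start at the returned index instead of token 0.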
-
I'm trying to run on GPU only. I'm getting endless warnings when I run the command `make LLAMA_HIPBLAS=1 && \
./koboldcpp.py`. Even if I ignore the warnings and run ./koboldcpp.py on its own, hip…
-
[kobold_debug.json](https://github.com/henk717/KoboldAI/files/15272513/kobold_debug.json)
For some reason token streaming just does not work. It's enabled and the actual terminal output from the se…
-
I tried Qwen2-72B-Instruct with both this quantization: https://huggingface.co/bartowski/Qwen2-72B-Instruct-GGUF/blob/main/Qwen2-72B-Instruct-Q4_K_M.gguf
And this one: https://huggingface.co/mraderma…
-
The OS is Windows 11. I noticed KoboldCpp 1.64.1 has Vulkan driver support, so I gave it a try with my AMD 6800U, 32 GB RAM, and 3 GB VRAM with GPU shared memory; the total VRAM can be boosted to 17 GB. It has vul…
-
The [Mantella mod](https://github.com/art-from-the-machine/Mantella) introduces the possibility of talking to [Skyrim NPCs](https://www.nexusmods.com/skyrimspecialedition/mods/98631), revolutionizing the…
-
Hello guys! I don't know if I'm allowed to ask these questions...
I'd like to know a few things.
First off, I work on a Windows 11 computer. My setup is:
i5-10400F
16 GB RAM
RX 6600 XT
7B HF LLaMA model
C…
-
A new sampler named 'DRY' appears to handle repetition in model output much better than the crude repetition penalty currently exposed by KoboldCpp.
https://github.com/oobabooga/text-generatio…
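In rough terms, DRY penalizes any token that would extend a sequence already repeated verbatim in the context, with the penalty growing exponentially in the length of the repeat. A simplified sketch of that idea (default parameter values follow the linked PR; the real implementation also supports sequence breakers, omitted here):

```python
def dry_penalty(context, logits, multiplier=0.8, base=1.75, allowed_length=2):
    """Subtract a DRY-style penalty from the logit of any token that would
    continue a verbatim repeat of the context's current suffix.

    `context` is a list of token ids; `logits` maps token id -> logit.
    """
    n = len(context)
    for i in range(n - 1):
        # Length of the match between the context ending at position i
        # and the suffix of the whole context.
        k = 0
        while k <= i and context[i - k] == context[n - 1 - k]:
            k += 1
        if k >= allowed_length:
            # The token that followed the earlier occurrence would
            # continue the repeat if sampled now: penalize it.
            cand = context[i + 1]
            if cand in logits:
                logits[cand] -= multiplier * base ** (k - allowed_length)
    return logits
```

This O(n²) scan is only illustrative; production implementations use more efficient suffix matching.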
-
A number of open-source models like LLaMA 2 can run in local environments behind a webserver (like LM Studio, KoboldCpp, etc.) that exposes endpoints identical to OpenAI's. Can we have a flag/option …
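Redirecting OpenAI-style calls to such a local server mostly means swapping the base URL. A minimal stdlib sketch of building such a request (the port, base URL, and model name are assumptions; KoboldCpp's default port is 5001, and the endpoint path follows the OpenAI chat completions API):

```python
import json
import urllib.request

# Assumed local endpoint; adjust to wherever the local server listens.
BASE_URL = "http://localhost:5001/v1"

def build_chat_request(messages, model="local-model"):
    """Build a POST request for a local OpenAI-compatible endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "Hello"}])
# Sending is omitted here; a real client would call urllib.request.urlopen(req).
```

Dedicated OpenAI client libraries typically expose the same switch as a `base_url` (or similar) constructor parameter.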