-
**Describe the Issue**
Apologies, I am by no means an expert, and I am still learning.
Recently, after upgrading KoboldCPP I have been seeing some strange repetition and other issues with response…
-
I have GPU-accelerated training with CUDA / ROCm and BitsAndBytes 4-bit quantization working. See https://github.com/instruct-lab/cli/pull/520#issuecomment-1993645744 for more information. However `la…
tiran updated
1 month ago
-
Just about to add these models to the list.
-
I cannot import Llama:
`from llama_cpp import Llama`
This results in
```
{
"name": "RuntimeError",
"message": "Failed to load shared library 'c:\\Users\\user\\Documents\\llamacpp\\llama-…
-
### Cortex version
v176
### Describe the Bug
Worked correctly for v172, regression for v176
1. server is not running
2. `cortex models list`
starts server - is this expected?
then successfull…
-
## Description
I am not sure I fully understand the following:
https://github.com/containers/ai-lab-recipes/blob/55610a8c90b6e72c6e9289513825112e6c5e99b1/model_servers/llamacpp_python/src/run…
-
llama.cpp has recently added Command-R support. Can we get it for llama-cpp-python?
https://github.com/ggerganov/llama.cpp/commit/12247f4c69a173b9482f68aaa174ec37fc909ccf
https://huggingface.co/C…
-
### Describe the bug
The software refuses to load the quant of DeepSeek-Coder-V2-Instruct.
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Reprodu…
-
## Overview
- Intel's Lunar Lake, which combines a CPU, NPU, and iGPU on a single chip, is releasing soon
## Tasklist
- [x] https://github.com/janhq/cortex.cpp/issues/677
- [x] https://github.com/janhq/cort…
-
I want to deploy it via Ollama, so I first converted it to a .gguf file with llama.cpp's convert_hf_to_gguf.py, but I got a KeyError "". I then found that it is not in added_tokens_decoder of tokenizer_c…