-
**Description**
For some time, llama.cpp has had an option to use a Q8 or Q4 quantized KV cache. It is available, for example, in KoboldCPP and works great there.
Using quantized KV cache reduces VRAM require…
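The VRAM saving can be estimated with simple arithmetic: the KV cache stores one K and one V vector per layer per context token, so its size scales linearly with the bytes per element of the cache type. The sketch below uses illustrative 7B-class dimensions (not taken from the issue) and the approximate per-element sizes of the FP16, Q8_0, and Q4_0 formats:

```python
# Rough KV-cache size estimate. Model dimensions are illustrative
# assumptions for a 7B-class model, not values from the issue.
def kv_cache_bytes(n_layers, n_ctx, n_embd, bytes_per_elem):
    # 2x because both K and V are cached for every layer and token.
    return 2 * n_layers * n_ctx * n_embd * bytes_per_elem

n_layers, n_ctx, n_embd = 32, 4096, 4096
f16 = kv_cache_bytes(n_layers, n_ctx, n_embd, 2.0)     # FP16: 2 bytes/elem
q8  = kv_cache_bytes(n_layers, n_ctx, n_embd, 1.0625)  # Q8_0: 34 bytes per 32-elem block
q4  = kv_cache_bytes(n_layers, n_ctx, n_embd, 0.5625)  # Q4_0: 18 bytes per 32-elem block
print(f"FP16: {f16/2**30:.2f} GiB, Q8_0: {q8/2**30:.2f} GiB, Q4_0: {q4/2**30:.2f} GiB")
```

With these numbers the cache drops from about 2 GiB at FP16 to roughly half at Q8_0 and just over a quarter at Q4_0, which is why the feature is attractive on VRAM-constrained GPUs.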
-
**The bug**
When I try to load a model with the LlamaCpp loader, I get the following error:
```
Exception ignored in:
Traceback (most recent call last):
File "/Users/mathieu.tammaro/Work/Per…
-
Hello guys! I don't know if I can ask these questions here...
So I want to know a few things.
First of all, I work on a Windows 11 computer. My setup is:
i5-10400F
16 GB RAM
RX 6600 XT
7B HF Llama model
C…
-
### Cortex version
cortex-1.0.0-rc1-windows-amd64.tar.gz
### Describe the Bug
cortex v1.0.0-rc1 cannot start the server.
### Steps to Reproduce
1. `cortex start -p 1234`
2. It displays: Could not start ser…
-
**Describe the bug**
I am trying to fine-tune the `phi-2` model on a custom dataset using the `ilab model train` command. The command downloads the model successfully from Hugging Face; however, it later fails wi…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
## Expected Behavior
We should be able to select different backends. We need to support at least:
1- llamacpp
2- gpt-j
3- hugging face transformers models
## Current Behavior
Only llamacpp is su…
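One common way to structure the requested backend selection is a small registry that maps a backend name to an implementation class. The sketch below is purely illustrative, not the project's actual code; every name in it (`register_backend`, `get_backend`, the backend classes) is hypothetical:

```python
# Hypothetical backend registry; all names here are illustrative only.
_BACKENDS = {}

def register_backend(name):
    """Class decorator that records a backend under a selectable name."""
    def decorator(cls):
        _BACKENDS[name] = cls
        return cls
    return decorator

@register_backend("llamacpp")
class LlamaCppBackend:
    def generate(self, prompt):
        return f"[llamacpp] {prompt}"

@register_backend("transformers")
class TransformersBackend:
    def generate(self, prompt):
        return f"[transformers] {prompt}"

def get_backend(name):
    """Instantiate the backend chosen by the user, with a clear error."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"Unknown backend {name!r}; choose from {sorted(_BACKENDS)}")
```

A registry like this keeps the current llamacpp path intact while letting gpt-j or Hugging Face Transformers backends be added by registering one more class.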
-
### Cortex version
Jan v0.5.7 | Cortex v-
### Describe the Bug
https://discord.com/channels/1107178041848909847/1300098068980568095
A known issue exists with the llama.cpp engine’s handling of s…
-
```
cli.py", line 172, in main
if model.is_antiprompt_present():
AttributeError: 'llamacpp.llamacpp.LlamaInference' object has no attribute 'is_antiprompt_present'
```
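An `AttributeError` like this usually indicates a version mismatch between the CLI script and the installed `llamacpp` bindings (the method exists in one version but not the other). A defensive guard can keep the CLI from crashing while the versions are reconciled; this is a hypothetical sketch, reusing the method name from the traceback above for illustration:

```python
# Hypothetical guard; `is_antiprompt_present` mirrors the name in the
# traceback above and may not exist on all binding versions.
def antiprompt_present(model):
    # Fall back to False when the bindings do not expose the method,
    # instead of raising AttributeError.
    checker = getattr(model, "is_antiprompt_present", None)
    return checker() if callable(checker) else False
```

Pinning the `llamacpp` package to the version the CLI was written against is the more robust fix; the guard only papers over the mismatch.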
-
When I create a `conda` environment using these steps:
```
conda create --name gguf-to-torch python=3.12 -y
conda activate gguf-to-torch
conda install pytorch torchvision torchaudio pytorch-cuda…