-
### Describe the feature
Is anyone working on porting llama.cpp to vlang? That'll be something.
### Use Case
llama.cpp being usable from vlang
### Proposed Solution
_No response_
### Other Information…
-
**Describe the Issue**
llama.cpp exposes the options `--grp-attn-n` and `--grp-attn-w` for the _Group size_ and _Neighbor window size_ hyperparameters from the SelfExtend [paper](https://arxiv.org/…
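For reference, a minimal sketch of how these two options would be passed on the llama.cpp command line; the binary name, model path, and concrete values below are placeholders for illustration, not taken from the issue:

```python
import subprocess

# Hypothetical SelfExtend invocation: group size 4, neighbor window 2048.
# "./llama-cli" and the model path are placeholders.
cmd = [
    "./llama-cli",
    "-m", "models/7B/model.gguf",
    "-c", "8192",            # extended context length
    "--grp-attn-n", "4",     # Group size from the SelfExtend paper
    "--grp-attn-w", "2048",  # Neighbor window size
]
subprocess.run(cmd, check=True)
```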
-
When I use the `--lm_model_name` flag and the path to one of my local GGUF model files, I get this error: `OSError: Incorrect path_or_model_id: [path to file]`. Please provide either the path to a loca…
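That error text matches the validation message Hugging Face transformers raises when `path_or_model_id` is neither a local model folder nor a Hub repo id, so a bare `.gguf` file path fails the check; this reading of the traceback is an assumption, as is the placeholder filename below:

```python
from llama_cpp import Llama

# llama-cpp-python accepts the GGUF file path directly.
llm = Llama(model_path="./models/my-model.q4_0.gguf")

# transformers-based loaders instead expect a model directory (one that
# contains config.json etc.) or a Hub repo id; handing them a single
# .gguf file is what produces "Incorrect path_or_model_id".
```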
-
**Describe the Issue**
Upstream we have the new feature of ARM-optimized models (Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8). I tried to run every one of them on my Snapdragon 8G1, but I was unable to run it wi…
-
My script located here has a basic test now...
[https://github.com/collabora/WhisperSpeech/issues/67](https://github.com/collabora/WhisperSpeech/issues/67)
I'm excited about the concept of rever…
-
### What happened?
![Screenshot_20240625-122328](https://github.com/ggerganov/llama.cpp/assets/26687662/494ac6f9-4467-49a0-a135-1de7bc9ef2f7)
Getting `GGML_ASSERT: ggml.c:21763: svcntb() == QK8_0`…
-
```json
{
  "platform": "",
  "hub-mirror": [
    "ghcr.io/ggerganov/llama.cpp:server-cuda"
  ]
}
```
-
### Class
Large language model
### Feature Request
Personally, I feel that the server module of llama-cpp-python is very simple and easy to use, but I have been unable to add this part of the functionality …
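For context, a minimal sketch of starting the OpenAI-compatible server bundled with llama-cpp-python (installed via `pip install "llama-cpp-python[server]"`); the model path and host/port values are placeholders:

```python
import subprocess
import sys

# Launch llama-cpp-python's OpenAI-compatible HTTP server.
# The model path below is a placeholder.
subprocess.run([
    sys.executable, "-m", "llama_cpp.server",
    "--model", "./models/my-model.q4_0.gguf",
    "--host", "127.0.0.1",
    "--port", "8000",
])
```

Once running, the server exposes the usual `/v1/chat/completions` endpoint and can be queried with any OpenAI-compatible client.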
-
This is how I am loading the model using Python, but it uses only the CPU:

```python
from llama_cpp import Llama

llm = Llama(model_path="./functionary-7b-v2.q4_0.gguf", n_ctx=4096, n_gpu_layers=50)
```
I have also tried to re-install ll…
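A common cause is that the installed llama-cpp-python wheel was built without GPU support, in which case `n_gpu_layers` has no effect. A minimal sketch of how one might check, assuming a CUDA build; the exact CMake flag varies by version (older releases used `-DLLAMA_CUBLAS=on`):

```python
# Reinstall with CUDA enabled (run in a shell, not in Python):
#   CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python

from llama_cpp import Llama

# verbose=True prints the model load log, which shows how many layers
# were actually offloaded to the GPU.
llm = Llama(
    model_path="./functionary-7b-v2.q4_0.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,   # -1 asks to offload all layers when the build supports it
    verbose=True,
)
```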
-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)…