-
Hi! I would like to try to get RWKV v6 models working with ollama.
llama.cpp already supports them.
- Currently ollama fails to load the model due to a bug in llama.cpp. Here's the fix PR: https…
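
To double-check the llama.cpp side independently of ollama, a quick sanity test is loading the GGUF with llama-cpp-python (a rough sketch; the model path, context size, and prompt are placeholders, and it assumes a llama-cpp-python build recent enough to include the RWKV v6 support):

```python
from llama_cpp import Llama

# Hypothetical local path to an RWKV v6 GGUF; point it at whatever file ollama is importing.
MODEL_PATH = "rwkv-6-world-1b6.Q8_0.gguf"

llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=2048,    # placeholder context size
    verbose=True,  # prints the architecture/hparams llama.cpp reads from the GGUF header
)

# If this produces text, llama.cpp itself handles the model and the failure is on ollama's side.
print(llm("The capital of France is", max_tokens=8)["choices"][0]["text"])
```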
-
### Steps to Reproduce
1. Create a Flutter app.
2. Install the library: `flutter pub add llama_cpp_dart`
3. Download llama.cpp, build a shared library (for me this is `libllama.dylib`), and add it to the project root.
4. …
-
Is this GGUF Llama 3.2 Vision model supported?
https://huggingface.co/leafspark/Llama-3.2-11B-Vision-Instruct-GGUF/tree/main
I tried running the 20 GB 11B model from Meta, but it's taking extre…
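
A quick way to check whether the file even declares an architecture llama.cpp knows about is to read the GGUF header with the `gguf` Python package (a rough sketch; the filename is a placeholder and field-access details can differ between gguf-py versions):

```python
from gguf import GGUFReader  # pip install gguf

# Placeholder filename; point it at the GGUF downloaded from the repo above.
reader = GGUFReader("Llama-3.2-11B-Vision-Instruct.Q4_K_M.gguf")

# general.architecture is what llama.cpp dispatches on; the model only loads
# if llama.cpp has code for the declared architecture.
arch = reader.fields["general.architecture"]
print(bytes(arch.parts[arch.data[0]]).decode("utf-8"))
```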
-
### Discussed in https://github.com/ggerganov/llama.cpp/discussions/9960
Originally posted by **SteelPh0enix** October 20, 2024
I've been using llama.cpp w/ ROCm 6.1.2 on latest Windows 11 for…
-
# Current Behavior
I ran the following:
`CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --verbose`
and an error occurred:
`ERROR: Failed building wheel for llama-cpp-python`
# Environment …
-
### Preflight Checklist
- [x] I have read the [Contributing Guidelines](https://github.com/electron/electron/blob/main/CONTRIBUTING.md) for this project.
- [x] I agree to follow the [Code of Conduct]…
-
Hello,
llama.cpp recently added support for an AArch64-specific GGUF type and AArch64-specific matmul kernels. Here is the merged PR: https://github.com/ggerganov/llama.cpp/pull/5780#pullrequest…
-
It would be really cool if llama-swap could terminate the currently running server after it's been idle (has not received any requests) for x seconds. Ideally this value should default to 0 (never ter…
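
Not llama-swap's actual code, just a Python sketch of the semantics I have in mind (the names and the 0-means-never default are my own):

```python
import subprocess
import threading
import time

class IdleStopper:
    """Terminates a server process after `idle_seconds` without requests (0 = never)."""

    def __init__(self, proc: subprocess.Popen, idle_seconds: float = 0):
        self.proc = proc
        self.idle_seconds = idle_seconds
        self.last_request = time.monotonic()
        if idle_seconds > 0:
            threading.Thread(target=self._watch, daemon=True).start()

    def touch(self):
        # Call this on every request proxied to the server.
        self.last_request = time.monotonic()

    def _watch(self):
        while self.proc.poll() is None:
            if time.monotonic() - self.last_request > self.idle_seconds:
                self.proc.terminate()  # free VRAM until the next request swaps the model back in
                return
            time.sleep(1)
```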
-
I got it to work just by following the instructions; I'm using CUDA 12.3:
`set CMAKE_ARGS="-DLLAMA_CUBLAS=on" && set FORCE_CMAKE=1 && pip install --no-cache-dir llama-cpp-python==0.2.90 --extra-i…
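
After the install, I sanity-check that layers actually get offloaded to the GPU with a small script like this (a rough sketch; the model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="model.Q4_K_M.gguf",  # placeholder path to any local GGUF
    n_gpu_layers=-1,                 # offload as many layers as the backend allows
    verbose=True,                    # the startup log should mention CUDA and the offloaded layer count
)
print(llm("Hello", max_tokens=4)["choices"][0]["text"])
```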
-
### What happened?
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
/owner/ninth/llama.cpp/ggml/src/ggml-cann.cpp:61: CANN error: E89999: In…