-
# Prerequisites
I am running the latest code. Development is very rapid, so there are no tagged versions as of now.
I carefully followed the [README.md](https://github.com/abetlen/llama-cpp-python/b…
-
The version of llama-cpp-python this project uses is quite old, so I get a lot of errors about GGML model versions, and it doesn't support GGUF models.
I would suggest to up the ver…
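The GGML-vs-GGUF mismatch above can be diagnosed before loading: both containers start with a 4-byte magic, and newer llama-cpp-python releases only load GGUF. A minimal sketch (the magic values come from the GGUF and legacy GGML file headers; `detect_model_format` is an illustrative helper, not part of any library):

```python
# Known 4-byte magics as they appear as raw bytes on disk
# (each is a little-endian uint32 in the header). b"GGUF" marks the
# current GGUF container; the others are legacy GGML-era formats that
# newer llama-cpp-python releases no longer load.
MAGICS = {
    b"GGUF": "gguf",
    b"lmgg": "ggml (legacy)",
    b"fmgg": "ggmf (legacy)",
    b"tjgg": "ggjt (legacy)",
}

def detect_model_format(path):
    """Best-effort guess of a model file's container format from its magic."""
    with open(path, "rb") as f:
        magic = f.read(4)
    return MAGICS.get(magic, "unknown")
```

If this reports a legacy format, the file needs to be converted (or re-downloaded) as GGUF before a current llama-cpp-python can load it.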
-
### What is the issue?
I have deployed ollama using the docker image 0.3.10. Loading "big" models fails.
llama3.1 and other "small" models (e.g. codestral) fit into one GPU and work fine. llama3.1…
-
Pulled the latest code, which updated llama.cpp in the talk-llama example.
The build is failing on:
https://github.com/ggerganov/whisper.cpp/blob/master/examples/talk-llama/llama.cpp#L1116
`WHISPER_CUBLAS=…
-
Even though I'm using a GPU build, inference runs on the CPU/RAM. I tried tinkering with parameters, but with no luck.
Log:
```
Godot Engine v4.3.stable.mono.official.77dcf97d8 - https://go…
-
Hi, I'm trying to find out how to use JSON outputs; any example would be appreciated. I'm digging into the code in the meantime!
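One route is llama-cpp-python's OpenAI-style `response_format={"type": "json_object"}`, which can optionally carry a JSON schema to constrain the output (whether the `"schema"` key is honored depends on the installed version, so treat it as an assumption). A sketch that builds such a request body and parses the reply; both helper names are illustrative:

```python
import json

def build_json_request(prompt, schema=None):
    """Build a chat-completion payload that asks for JSON-only output.

    Assumes an OpenAI-compatible endpoint (e.g. llama-cpp-python's server)
    that understands response_format={"type": "json_object"}.
    """
    fmt = {"type": "json_object"}
    if schema is not None:
        # Schema-constrained output; support varies by server version.
        fmt["schema"] = schema
    return {
        "messages": [
            {"role": "system", "content": "You reply only with valid JSON."},
            {"role": "user", "content": prompt},
        ],
        "response_format": fmt,
    }

def parse_json_reply(raw_content):
    """Parse the assistant message content; raises ValueError on bad JSON."""
    return json.loads(raw_content)
```

The same `response_format` argument can be passed directly to `Llama.create_chat_completion` when using the library in-process rather than over HTTP.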
-
Ability to pass custom JSON parameters through to APIs.
Useful for things like:
- Anthropic steering
- Customizing Chapter II ems
- llama.cpp custom sampling parameters
The most gormed way t…
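A passthrough like the one requested above could be sketched as a shallow merge of user-supplied JSON parameters into the outgoing request body, with a guard so the passthrough cannot clobber fields the client must control. The function and the protected-key names are illustrative assumptions, not an existing API:

```python
def merge_extra_params(base_body, extra, protected=("model", "messages")):
    """Shallow-merge user-supplied extra JSON parameters into a request body.

    Keys listed in `protected` cannot be overridden, so a passthrough of
    e.g. llama.cpp sampling parameters cannot replace the model or the
    conversation itself. Returns a new dict; the input is left untouched.
    """
    merged = dict(base_body)
    for key, value in extra.items():
        if key in protected:
            raise ValueError("cannot override protected parameter: %s" % key)
        merged[key] = value
    return merged
```

Usage: `merge_extra_params(body, {"min_p": 0.05, "repeat_penalty": 1.1})` would forward llama.cpp sampling knobs the client does not natively expose.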
-
When I run the Docker container, I see that the GPU is only being used for the embedding model (encoder), not the LLM.
I noticed that llama-cpp-python is not compiled properly (Notice: BLAS=0), as d…
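`BLAS=0` at load time means the wheel was built without GPU acceleration. The usual fix is to force a source rebuild with the GPU backend enabled via `CMAKE_ARGS` (a sketch; the exact flag is version-dependent, and inside Docker this belongs in the image build):

```shell
# Rebuild llama-cpp-python from source with the CUDA backend so the
# startup log reports BLAS=1. Older releases use -DLLAMA_CUBLAS=on,
# newer ones use -DGGML_CUDA=on; pick the one matching your version.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```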
-
### What is the issue?
qwen4b works fine; all models larger than 4B produce gibberish.
```
time=2024-09-05T11:35:49.569+08:00 level=INFO source=download.go:175 msg="downloading 8eeb52dfb3bb in 1…
-
## Overview
## Tasklist
- [ ] Can this be solved via llama.cpp? (e.g. compiled for Vulkan and ROCm)
- [x] https://github.com/janhq/cortex.llamacpp/issues/9
- [ ] [https://github.com/janhq/jan/issues…