-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)…
-
In https://github.com/instructlab/instructlab/pull/1797 we enabled batching unconditionally with remote endpoints.
However, we know it doesn't work with llama-cpp - see e.g. https://github.com/inst…
-
https://levelup.gitconnected.com/live-indexing-for-rag-a-guide-for-real-time-indexing-using-llamaindex-and-aws-51353083ace4
This will allow us to track usage with Matomo.
-
The llama.cpp integration within the playbook does not work. I manually created the GGUF file, but when I try to serve the model using the llama.cpp server I get the following error…
-
Hello, when I run the model locally as described in the repo, I get this error:
`index 2 is out of bounds for dimension 1 with size 2`
with this traceback:
```
/py…
```
-
Please create the following browser WASM demos:
1) Stable Diffusion with W8A8 quantization: this is important because the Stable Diffusion [demo](https://intel.github.io/web-ai-showcase/) which I sa…
-
I installed and unpacked LlamaAssistant-macOS-arm64.zip
The application started correctly, the interface was normal, but when creating a request (in English) and selecting any analysis, the applica…
-
`docker run llamastack/llamastack-local-gpu:latest` does nothing
-
Hello, I've been asking a lot of questions today. After building the Android phone app I created as an example and installing it on a Galaxy S22 model with 12GB of memory, I found that only the Llama …