-
I attempted to run a low-level API example with version 0.2.11, but both installing from PyPI and compiling from source failed.
python: 3.10.12
llama_cpp_python: 0.2.11
```bash
{llama-cpp-python/examples/lo…
islwx updated 2 months ago
-
### Your current environment
- vLLM openai docker image: v0.3.2
- Nvidia A100 GPU
- Nvidia Cuda Toolkit: 12.2
### 🐛 Describe the bug
As the number of concurrent requests increases, GPU utilizatio…
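A minimal sketch of the kind of concurrency fan-out that can drive such a test against the vLLM OpenAI-compatible server (the endpoint URL, port, and model name here are assumptions for illustration, not taken from the report):

```python
import concurrent.futures
import json
import urllib.request


def fan_out(request_fn, n):
    """Run request_fn(i) for i = 0..n-1 concurrently and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(request_fn, range(n)))


def send_completion(i):
    # Hypothetical vLLM OpenAI-compatible endpoint; adjust host and model.
    body = json.dumps({
        "model": "meta-llama/Llama-2-7b-hf",
        "prompt": f"Request {i}: say hello.",
        "max_tokens": 64,
    }).encode()
    req = urllib.request.Request(
        "http://localhost:8000/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Increase n (e.g. 1, 8, 32, 128) while watching nvidia-smi to observe
    # how GPU utilization changes with the number of concurrent requests.
    results = fan_out(send_completion, 8)
    print(len(results), "responses")
```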
-
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit co…
-
### Summary
Enable CANN support for WASI-NN ggml plugin.
### Details
Adding CANN support to the WASI-NN ggml plugin is relatively straightforward. The main changes involve adding the following code…
-
# Notes
The error happens in the `ExportedProgram.run_decompositions()` call; the message is `Cannot view a tensor with shape torch.Size([1, 512, 32, 128]) and strides (2097152, 128, 65536, 1) as a t…
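A small pure-Python sketch of why this particular `view` is rejected: `view()` requires strides compatible with a contiguous layout of the tensor's shape, and the strides in the error message instead match a contiguous `(1, 32, 512, 128)` tensor after `transpose(1, 2)` (i.e. a non-contiguous view). The shape and strides below are taken from the error message; everything else is illustrative.

```python
def contiguous_strides(shape):
    """C-contiguous strides (in elements) for a given shape."""
    strides = []
    acc = 1
    for dim in reversed(shape):
        strides.append(acc)
        acc *= dim
    return tuple(reversed(strides))


shape = (1, 512, 32, 128)
reported = (2097152, 128, 65536, 1)  # strides from the error message

# A contiguous tensor of this shape would have different strides:
print(contiguous_strides(shape))  # (2097152, 4096, 128, 1)

# The reported strides are exactly contiguous_strides((1, 32, 512, 128))
# with dims 1 and 2 swapped, i.e. a transposed, non-contiguous tensor:
print(contiguous_strides((1, 32, 512, 128)))  # (2097152, 65536, 128, 1)
```

Calling `.contiguous()` (or using `.reshape()` instead of `.view()`) on such a tensor is the usual way to avoid this class of error, though where to apply that inside the decomposition is the open question here.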
-
### Bug Report
Since installing v3.2.1, selecting any Llama3 model causes the application to crash. Prior to installing v3.2.1, the models worked as expected without issue.
### Steps to Reproduce
1. …
-
### Contact Details
_No response_
### What happened?
I just downloaded [Meta-Llama-3.1-8B-Instruct.Q5_K_M.llamafile](https://huggingface.co/Mozilla/Meta-Llama-3.1-8B-Instruct-llamafile/blob/m…
-
### What is the issue?
For the following basic AI test:
```
I have 10 apples. I find 3 gold coins in the bottom of a river. The river runs near a big city that has something to do with what I can s…
-
The command to run `search-api-server.wasm`:
```
wasmedge --dir .:. --env LLAMA_LOG="info" \
--nn-preload default:GGML:AUTO:Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf \
search-api-server.…
-
Hi,
I have a question regarding the Hugging Face model weights.
I was trying to load some of your adapters and play with them, but I found that the adapters were very large (~4GB), as in the screenshot be…