-
### Describe the bug
After downloading a model, I try to load it, but I get this message in the console:
Exception: Cannot import 'llama-cpp-cuda' because 'llama-cpp' is already imported. Switching to…
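A guard of this kind typically checks `sys.modules` before importing a second backend build. The sketch below is hypothetical (the function name and fallback behavior are assumptions, not `llama-cpp-python`'s actual code), but it reproduces the situation in the message: the CPU build is already imported, so the CUDA build is refused and the loader switches back.

```python
import sys

# Hypothetical sketch of the kind of guard behind this error: refuse to
# import the CUDA backend once the CPU backend is already in sys.modules,
# and fall back to the one that is loaded instead of crashing.
def load_backend(preferred: str = "llama_cpp_cuda") -> str:
    conflict = {"llama_cpp_cuda": "llama_cpp", "llama_cpp": "llama_cpp_cuda"}[preferred]
    if conflict in sys.modules:
        print(f"Cannot import '{preferred}' because '{conflict}' is "
              f"already imported. Switching to '{conflict}'.")
        return conflict
    return preferred

# Simulate the reported situation: the CPU build got imported first.
sys.modules.setdefault("llama_cpp", object())  # stand-in for the real module
print(load_backend())  # falls back to 'llama_cpp'
```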
-
Command - `local-llm run meta-llama/Llama-3.2-11B-Vision 8000 --verbose`
Issues - Not able to download and run the LLM model
```
Traceback (most recent call last):
File "/home/ubuntu/.local/…
-
### Start Date
_No response_
### Implementation PR
_No response_
### Reference Issues
_No response_
### Summary
Hi, I am trying to load this using Llama.CPP HTTP s…
-
## Goal
- Cortex can handle all llama.cpp params correctly
- Model running params (i.e. POST `/v1/models//start`)
- Inference params (i.e. POST `/chat/completions`)
- Function Calling, eg for llama.c…
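To make the distinction in the goal concrete, here is a sketch of the two payload kinds. The field names below are typical llama.cpp/OpenAI-style parameters, not Cortex's exact schema, so treat them as illustrative assumptions:

```python
import json

# Model running params: sent once, when the model is started
# (the POST /v1/models/.../start style endpoint above).
start_params = {
    "ctx_len": 4096,   # context window size
    "ngl": 33,         # number of layers to offload to the GPU
}

# Inference params: sent per request (POST /chat/completions).
inference_params = {
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256,
}

body = json.dumps(inference_params)
print(body)
```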
-
### Description
I have tried a number of Hugging Face models and consistently get the error message:
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 292, …
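This error means the tensor count declared in the GGUF header disagrees with what the loader actually finds, often a sign of a truncated download or a conversion done with a mismatched tool version. A quick way to see the declared count is to read the fixed GGUF header (per the GGUF spec: 4-byte magic `GGUF`, uint32 version, uint64 tensor count, uint64 metadata KV count). A minimal sketch, using a synthetic header for illustration:

```python
import struct

def gguf_header(raw: bytes):
    """Parse the fixed 24-byte GGUF header: magic, version,
    tensor_count, metadata_kv_count (all little-endian)."""
    magic, version, tensor_count, kv_count = struct.unpack("<4sIQQ", raw[:24])
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, tensor_count, kv_count

# Synthetic header for illustration: version 3, 292 tensors, 20 KV pairs.
sample = struct.pack("<4sIQQ", b"GGUF", 3, 292, 20)
print(gguf_header(sample))  # (3, 292, 20)
```

Comparing this declared count against the one in the error message helps tell a corrupt file apart from a loader bug.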
-
This is a problem that I think was closed prematurely:
https://github.com/abetlen/llama-cpp-python/issues/1166
I am currently trying to get a Llama 3.1 70B GGUF running on two 3090s, and no ma…
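For a two-GPU setup, llama.cpp distributes layers according to the proportional `tensor_split` fractions. The arithmetic below is a rough illustration of how such fractions could map repeating layers onto GPUs; llama.cpp's actual assignment logic differs in detail, so this is a sketch only:

```python
def split_layers(n_layers: int, fractions):
    """Map proportional split fractions to contiguous per-GPU
    layer ranges, e.g. for an 80-layer 70B model on two GPUs."""
    total = sum(fractions)
    cuts, acc = [], 0.0
    for f in fractions[:-1]:
        acc += f
        cuts.append(round(n_layers * acc / total))
    bounds = [0] + cuts + [n_layers]
    return [(bounds[i], bounds[i + 1]) for i in range(len(fractions))]

# An even split of an 80-layer model across two 24 GB cards.
print(split_layers(80, [1.0, 1.0]))  # [(0, 40), (40, 80)]
```

Skewing the fractions (e.g. `[3.0, 1.0]`) is the usual workaround when one GPU also hosts the KV cache or a desktop session.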
-
Hi,
I've been trying to serve different Phi3 models using the Llama.cpp server created by the init-llama-cpp script from ipex.
When I serve with this version I have two problems:
1) The server doesn…
-
Hi,
I've created a blank Rust project with a single dependency:
```toml
llama_cpp_rs = "0.3.0"
```
However, when I tried to compile it, the build failed with the following error:
```bash…
-
Thank you for developing this useful resource. The Ollama notebook reports
```{"error":"llama runner process has terminated: error loading model vocabulary: cannot find tokenizer merges in model fi…
-
### Description of the bug:
Hi @pkgoogle ,
I used the example C++ code to run inference on the model I converted, and it shows an error.
- My command:
```
bazel run -c opt //ai_edge_torch/generative/example…