-
### Name and Version
```
.\llama-cli.exe --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA…
```
-
### What is the issue?
I tried to import a finetuned Llama-3.2-11B-Vision model, but I got "Error: unsupported architecture."
To make sure my model itself was not the problem, I downloaded [meta-llama/Ll…
-
I ran `pip install llama-stack` following the README, and although it installed successfully:
"llama model list --show-all
llama: command not found"
The `llama` command is not getting rec…
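A common cause of this is that pip installs the `llama` console script into a directory that is not on `PATH`. Below is a minimal check, assuming a standard (non-`--user`) install; the entry-point name `llama` is taken from the error message above:
```python
# Sketch: check whether pip's scripts directory holds a `llama`
# entry point and whether that directory is actually on PATH.
import os
import sysconfig

scripts_dir = sysconfig.get_path("scripts")
print("pip scripts directory:", scripts_dir)
if os.path.isdir(scripts_dir):
    print("has a llama script:",
          any(f.startswith("llama") for f in os.listdir(scripts_dir)))
print("directory is on PATH:",
      scripts_dir in os.environ.get("PATH", "").split(os.pathsep))
```
Note this checks the default install scheme; a `--user` install places scripts under the user base directory instead, which is an even more frequent reason the shell cannot find the command.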
-
Hi @lamikr,
I built rocm_sdk_builder on a freshly installed Ubuntu 24.04.1. It took 5 hours, 120 GB of storage, and many hours of fixing small issues while building the repo (reference: https://gith…
-
### What is the issue?
The hardware has 11.1 GiB of RAM + 1.9 GiB of GPU memory = 13 GiB, yet it fails to run a 3B model.
Any idea why?
```
Nov 14 17:49:49 fedora ollama[1197]: r14 0x6
Nov 14 17:49:49 fedor…
```
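For scale, here is a back-of-the-envelope estimate of what a 3B model typically needs; this is a sketch, and every figure in it is an assumption for illustration (4-bit quantized weights, short context), not a measurement:
```python
# Rough memory estimate for a 3B-parameter model, 4-bit quantized.
params = 3e9
bytes_per_weight = 0.56          # assumed: ~Q4_K_M average
weights_gib = params * bytes_per_weight / 2**30
kv_cache_gib = 0.5               # assumed: short context, GQA
overhead_gib = 1.0               # assumed: runtime buffers, compute graph
print(f"weights ≈ {weights_gib:.1f} GiB")
print(f"total   ≈ {weights_gib + kv_cache_gib + overhead_gib:.1f} GiB")
# ≈ 1.6 GiB weights, ≈ 3.1 GiB total: well under 13 GiB, so a plain
# out-of-memory condition looks unlikely from the sizes alone.
```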
-
### Problem Description
I am adding the new gfx target gfx1151 on Linux. It builds on Linux, and I can also build llama.cpp with the rocWMMA patch
https://github.com/ggerganov/llama.cpp/pull/7011/commits to …
-
### 🐛 Describe the bug
```python
from llama_index.core import Settings
from llama_index.memory.mem0 import Mem0Memory
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.…
```
-
# 🧐 Problem Description
Fast-LLM lacks support for Llama 3.x models due to missing compatibility with Llama-3-style RoPE scaling. This prevents us from effectively training or using Llama 3.x check…
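For reference, Llama-3-style RoPE scaling rescales the rotary inverse frequencies piecewise by wavelength: long wavelengths are divided by a fixed factor, short ones are left unchanged, and the band in between is smoothly interpolated. A minimal NumPy sketch of the scheme (default parameter values follow the published Llama 3.1 config; function and variable names here are ours):
```python
import numpy as np

def llama3_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                          high_freq_factor=4.0, old_context_len=8192):
    """Llama-3-style RoPE scaling of rotary inverse frequencies."""
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    wavelen = 2 * np.pi / inv_freq

    # Interpolation weight for the medium-frequency band.
    smooth = (old_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor)
    smoothed = (1 - smooth) * inv_freq / factor + smooth * inv_freq

    # Long wavelengths scaled, short kept, medium interpolated.
    scaled = np.where(wavelen > low_freq_wavelen, inv_freq / factor, inv_freq)
    medium = (wavelen <= low_freq_wavelen) & (wavelen >= high_freq_wavelen)
    return np.where(medium, smoothed, scaled)

# Example: base frequencies for head_dim = 128, rope_theta = 500000.
head_dim, theta = 128, 500000.0
inv_freq = 1.0 / theta ** (np.arange(0, head_dim, 2) / head_dim)
print(llama3_scale_inv_freq(inv_freq)[:4])
```
Checkpoints converted without applying this scaling produce degraded outputs at long positions, which is why plain NTK/linear scaling is not a drop-in substitute.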
-
When I quantized the Qwen2.5-1.5B-Instruct model following the "GGUF Export" section of examples.md in the docs, it reported that quantization was complete and I obtained the GGUF model. But when I load …
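One way to sanity-check the exported file independently of the original toolchain is to open it with llama-cpp-python; a sketch, where the model path is a placeholder:
```python
# Sketch: verify a freshly quantized GGUF file loads and can generate.
# "qwen2.5-1.5b-instruct-q4_k_m.gguf" is a placeholder path.
from llama_cpp import Llama

llm = Llama(model_path="qwen2.5-1.5b-instruct-q4_k_m.gguf", n_ctx=2048)
out = llm("Hello, ", max_tokens=16)
print(out["choices"][0]["text"])
```
If this load also fails, the export itself is suspect; if it succeeds, the problem is more likely in the loading code than in the GGUF file.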
-
I was told to move this story (https://github.com/comfyanonymous/ComfyUI/issues/5510) to the ComfyUI-N-Nodes repo.
### Expected Behavior
I'm not sure if Ollama models are required in any way, but I do see…
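If Ollama models do turn out to be required, a quick way to see what a local Ollama instance is serving is its `/api/tags` endpoint; a sketch, assuming Ollama's default port:
```python
# Sketch: list models available from a local Ollama server.
# Assumes the default endpoint http://localhost:11434.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    tags = json.load(resp)
for model in tags.get("models", []):
    print(model["name"])
```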