-
Thanks for creating this awesome project.
I was trying to play around with it.
I have a couple of PDFs that I wanted to use (mostly around 300 pages each).
I have a laptop RTX 3080 with 16GB …
-
I am getting this error:
```
llama.cpp: loading model from /Documents/Proj/delta/llama-2-7b-chat/ggml-model-q5_1.bin
error loading model: unrecognized tensor type 14
llama_init_from_file: failed…
-
I followed the instructions in README.md. It built successfully, I guess.
But when I run `wasmedge rag-api-server.wasm -h`, I get the following errors:
```
[2024-05-29 18:44:18.672] [error] instan…
-
I just tried the new large-v3-turbo model on translating a Japanese anime video into English. Instead of English, it gave the subtitles in Japanese, with each subtitle taking a block of 30 seconds in t…
-
### What happened?
I can no longer build llama.cpp with hipBLAS enabled. The following Dockerfile can be used to reproduce the issue:
```
FROM rocm/pytorch
ARG ROCM_TARGET_LST=/root/gfx
RUN…
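```

For comparison, a hipBLAS build that worked on earlier source trees looked roughly like the following. The flag names (`LLAMA_HIPBLAS`, `AMDGPU_TARGETS`) and the `gfx1030` target are assumptions that have changed across releases, so verify them against the current CMakeLists:

```shell
# Sketch only: build llama.cpp with hipBLAS/ROCm support on an older tree.
# Flag names below have been renamed in newer releases; gfx1030 is a placeholder target.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build \
      -DLLAMA_HIPBLAS=ON \
      -DAMDGPU_TARGETS=gfx1030 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build --config Release -- -j
```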
-
This is just an idea for you. Most modern smartphones come with some form of AI accelerator. I am aware GGML-based projects like llama.cpp can compile and run on mobile devices, but there is probably …
-
Hi,
I am running llama-cpp-python on a Surface Book 2 with an i7 and an NVIDIA GeForce GTX 1060.
I installed VC++ and CUDA drivers 12.4.
Running on Python 3.11.3.
Compiled llama using the command below on Min…
-
### Description
Hi, I am using the latest version of LLamaSharp, and my model is the Llama-3 70B GGUF version. When GpuLayerCount is between 0 and 5, although it is not very fast, I get the answer, b…
-
[GGUF](https://huggingface.co/docs/hub/en/gguf) is becoming a preferred means of distribution of FLUX fine-tunes.
Transformers recently added general support for GGUF and is slowly adding support …
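For context, GGUF files begin with a small fixed prelude that tools can sniff cheaply before committing to a full load. A minimal sketch of parsing it, assuming GGUF v2+ where the two counts are 64-bit (fields per the GGUF spec: magic, version, tensor count, metadata KV count, all little-endian):

```python
import struct

GGUF_MAGIC = 0x46554747  # the bytes b"GGUF" read as a little-endian uint32

def read_gguf_header(buf: bytes) -> dict:
    """Parse the fixed-size GGUF prelude: magic, version, tensor count, metadata KV count."""
    magic, version = struct.unpack_from("<II", buf, 0)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    n_tensors, n_kv = struct.unpack_from("<QQ", buf, 8)
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}
```

The variable-length metadata key/value section follows immediately after these 24 bytes; real readers such as the `gguf` Python package parse it in full.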
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…