-
### Description
On my laptop with a GTX 1060 in it, I have the CPU backend and the CUDA backend installed. However, the native call log does not show it even loading CUDA; instead it shows it trying…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
Add the new multi-modal model from Mistral AI, pixtral-12b:
https://huggingface.co/mistral-community/pixtral-12b-240910
-
I tried to play with a Mixtral Q5_K_M quant on both llama.cpp and Python. Both servers use CUDA 12, but the results are the same on 11.6. Here are some results:
llama.cpp, A100
```
llama_print_timings: load tim…
```
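For a like-for-like check on the Python side (assuming it is the llama-cpp-python binding), a minimal timing sketch might look like the following; the model filename and prompt are hypothetical, and `verbose=True` should make the binding echo llama.cpp's own `llama_print_timings` block so the two runs can be compared directly:

```python
import time
from llama_cpp import Llama

# Hypothetical model path; n_gpu_layers=-1 asks the backend to offload all layers.
llm = Llama(
    model_path="./mixtral-8x7b-instruct.Q5_K_M.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,
    verbose=True,  # should echo llama.cpp's own timing output on stderr
)

start = time.perf_counter()
out = llm("Explain KV caching in one sentence.", max_tokens=128)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]
print(f"{generated} tokens in {elapsed:.2f}s ({generated / elapsed:.1f} tok/s)")
```

Comparing per-token eval times from the two timing blocks is usually more telling than wall-clock totals, since load time dominates short runs.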
-
**LocalAI version:**
```
v1.25.0-cublas-cuda12-ffmpeg
```
**Environment, CPU architecture, OS, and Version:**
```
# uname -a
Linux localai-ix-chart-f8bbbb7c7-x6xx9 6.1.42-production+truen…
```
-
```
(privateGPT) C:\Users\AJAY\Desktop\PrivateGpt>$env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
The filename, directory name, or volume label syntax is incorrect.
…
```
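This error typically means the commands above were typed into cmd.exe rather than PowerShell: `$env:CMAKE_ARGS = "…"` is PowerShell-only syntax, which cmd.exe rejects with exactly this message. Running the same line in a PowerShell session, or using cmd.exe's own `set CMAKE_ARGS=-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS` before invoking pip, should avoid it.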
-
Hello guys, I am trying to run the mpt-7b model and I am getting this error; I appreciate any help. Here are the details:
```
Node.js v19.5.0
node_modules\llama-node\dist\llm\llama-cpp.cjs:82
this.inst…
```
-
Additionally, it might be interesting to consider adding GGUF'ed versions of more models?
![image](https://github.com/user-attachments/assets/0414d354-47db-4dfa-b4de-cd3443217bbe)
-
I upgraded from an older version and experienced a disturbingly long read-ahead time.
The load on my machine is about the same (a bit higher with Python, but that's understandable).
I tried to speci…
-
Hi all,
We recently developed a fully open-source quantization method called VPTQ (Vector Post-Training Quantization) [https://github.com/microsoft/VPTQ](https://github.com/microsoft/VPTQ) which en…
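The excerpt does not show how VPTQ is used, so the sketch below is only a generic illustration of the idea named in its title (replacing groups of weights with indices into a learned codebook), written with plain k-means in NumPy; it is not VPTQ's actual algorithm or API, and every name and size in it is made up for illustration:

```python
import numpy as np

# Generic vector-quantization sketch (illustrative only, NOT VPTQ's algorithm):
# split a weight matrix into length-d vectors, learn a k-entry codebook with a
# few k-means rounds, and store one index per vector instead of d floats.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)

d, k = 8, 256                     # assumed vector length and codebook size
vecs = W.reshape(-1, d)           # (8192, 8) vectors to quantize
codebook = vecs[rng.choice(len(vecs), k, replace=False)].copy()

for _ in range(10):               # plain k-means updates
    # assign each vector to its nearest codeword
    dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(1)
    # move each codeword to the mean of its assigned vectors
    for j in range(k):
        members = vecs[assign == j]
        if len(members):
            codebook[j] = members.mean(0)

W_hat = codebook[assign].reshape(W.shape)
mse = float(np.mean((W - W_hat) ** 2))
print(f"reconstruction MSE={mse:.4f}, ~{np.log2(k) / d:.1f} bits/weight")
```

With a 256-entry codebook over length-8 vectors, each group of 8 weights is stored as a single 8-bit index, i.e. roughly 1 bit per weight before counting the codebook itself.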