-
Does it use any matrix-computation acceleration framework from Rust's library ecosystem?
Are there any plans to take it further, for instance, to make it as popular as ggml?
-
Hello everyone!
I'm currently enhancing the GGML implementation of an LSTM network.
My main focus is avoiding scalability issues with the computational graph.
Currently I'm setting GGML_…
-
@simonJJJ Hi, could you please give some advice for this issue? Qwen-7B-Q4_0 works well on Mac M1, but Qwen-7B-Q8_0 cannot.
```
cmake -B build -DGGML_METAL=ON && cmake --build build -j
./ma…
```
-
Thanks a lot for this repo!
I cloned the Cheetah repo, and cloned the whisper.cpp repo (both at the same level). I followed the instructions to download the ggml-model. I brew installed sdl2 and al…
-
### Contact Details
_No response_
### What happened?
According to the warnings (such as: `get_nvcc_path: note: /usr/local/cuda/bin/nvcc.exe does not exist`) output in the uploaded log, it seems llama…
-
When building on a system that lacks the expected CPU flags, CMakeLists.txt can end up never setting the `GGML_C_FLAGS` variable. As a result, this line is invoked with no value: https:/…
-
**Describe the bug**
I'm attempting to eliminate most, if not all, CPU fallback code paths in my GGML backend. One of these is reshaping a single vector into a 3D tensor - I know this is not efficient o…
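Worth noting for this kind of backend work: in ggml, reshaping a contiguous tensor is a view, not a copy — only the shape metadata changes and the underlying buffer is untouched. The sketch below (plain Rust with a hypothetical helper `flat_index`, not ggml's actual API) illustrates the index arithmetic involved, assuming ggml's layout where dimension 0 is the fastest-moving one.

```rust
// Hypothetical helper: map a 3-D index (i0, i1, i2) into a flat, contiguous
// buffer, with i0 as the fastest-moving dimension (ggml-style layout).
fn flat_index(i0: usize, i1: usize, i2: usize, ne0: usize, ne1: usize) -> usize {
    i0 + i1 * ne0 + i2 * ne0 * ne1
}

fn main() {
    // A "vector" of 24 elements...
    let data: Vec<i32> = (0..24).collect();
    // ...viewed as a 4 x 3 x 2 tensor (ne0 = 4, ne1 = 3, ne2 = 2), no copy.
    let (ne0, ne1, ne2) = (4, 3, 2);
    let mut walked = Vec::new();
    for i2 in 0..ne2 {
        for i1 in 0..ne1 {
            for i0 in 0..ne0 {
                walked.push(data[flat_index(i0, i1, i2, ne0, ne1)]);
            }
        }
    }
    // Walking the 3-D view in layout order visits the flat buffer in order,
    // which is why a reshape of a contiguous tensor needs no data movement.
    assert_eq!(walked, data);
}
```

If the view and the buffer agree like this, the reshape can be a metadata-only no-op in the backend; a real copy is only needed when the source tensor is not contiguous.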
-
llama-cpp-python version 0.2.25 | build with cublas -> Nvidia T4
Tested with llava model: https://huggingface.co/mys/ggml_llava-v1.5-13b
```python
chat_handler = Llava15ChatHandler(clip_model_p…
```
-
At present, we are using GGML's computation graph. This works well, but it has a few flaws:
1) We're reliant on whatever support GGML has for threading; the Rust threading ecosystem is more versati…
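As an illustration of what the Rust ecosystem offers here, the sketch below (a hypothetical helper `parallel_scale`, not the project's actual code) splits an elementwise op across scoped threads using only the standard library; `std::thread::scope` joins all workers before returning, so no lifetime or thread-pool plumbing is needed.

```rust
use std::thread;

// Minimal sketch: apply an elementwise scale across `n_threads` scoped
// threads, each owning a disjoint mutable chunk of the slice.
fn parallel_scale(data: &mut [f32], factor: f32, n_threads: usize) {
    if data.is_empty() {
        return;
    }
    // Ceiling division so every element lands in exactly one chunk.
    let chunk = (data.len() + n_threads - 1) / n_threads;
    thread::scope(|s| {
        for part in data.chunks_mut(chunk) {
            s.spawn(move || {
                for x in part.iter_mut() {
                    *x *= factor;
                }
            });
        }
    }); // scope joins all spawned threads here
}

fn main() {
    let mut v = vec![1.0_f32; 8];
    parallel_scale(&mut v, 2.0, 4);
    assert!(v.iter().all(|&x| x == 2.0));
}
```

Crates such as rayon make this even shorter, which is the versatility argument: the graph executor could pick its own scheduling strategy instead of inheriting GGML's.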
-
This project looks amazing!! Unfortunately I cannot self-host it without paying for OpenAI models - is there a possibility that you could look into alternative open-source models utilizing [ggml](http…