-
**Problem**
Lack of support for VNNI
**Success Criteria**
**Additional context**
-
### Describe the feature request
Wasm Relaxed SIMD includes integer dot product instructions, which will map to VNNI instructions on X86-64 platforms with AVX-VNNI (on ARM maybe SDOT, but I haven't t…
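For reference, a scalar sketch of what these dot-product instructions compute. This mirrors the per-lane behavior of x86's VPDPBUSD: in each 32-bit lane, four u8 × i8 products are summed and added to an i32 accumulator without saturation (ARM's SDOT is the analogous i8 × i8 form). The function name is illustrative, not from any library:

```rust
/// Scalar model of a 128-bit u8 x i8 dot-product-accumulate,
/// the per-lane behavior of x86 VPDPBUSD. Illustrative only.
fn dot_u8i8_accum(acc: [i32; 4], a: [u8; 16], b: [i8; 16]) -> [i32; 4] {
    let mut out = acc;
    for lane in 0..4 {
        let mut sum: i32 = 0;
        for k in 0..4 {
            let i = lane * 4 + k;
            // widen both bytes to i32, multiply, sum the four products
            sum += i32::from(a[i]) * i32::from(b[i]);
        }
        // VPDPBUSD accumulates without saturation
        out[lane] = out[lane].wrapping_add(sum);
    }
    out
}

fn main() {
    // each lane sums four 3 * 2 products -> 24, added to acc lane 100
    let r = dot_u8i8_accum([100; 4], [3; 16], [2; 16]);
    println!("{:?}", r); // [124, 124, 124, 124]
}
```

A hardware instruction does all lanes in one step; the relaxed-SIMD wording leaves the exact signedness of the operands implementation-defined so it can lower to either VPDPBUSD or SDOT.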
-
I've been working on securing the user input, escaping invalid characters; however, I've encountered a few prompts which cause llama-cli to abruptly halt:
```
.\llama-cli.exe --model "..\..\..\mod…
-
### What happened?
I use `-i` `-if` and the flags are ignored, and it exits with "input is empty"
llama_new_context_with_model: graph nodes = 2246
llama_new_context_with_model: graph splits = 1
co…
-
Some Intel Xeon server CPUs (for example _Xeon Platinum 8171M_ or _Xeon Platinum 8272CL_) support the VNNI instructions. Is this something which could be used for better performance, or is it not suited fo…
-
### Background and motivation
There already is support for the AVX-VNNI hardware instruction set with 128-/256-bit vectors, and it would be good to have the same support for 512-bit vectors. (ve…
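As a width-agnostic sketch of what the 512-bit form adds: the operation is the same per-lane u8 × i8 multiply-accumulate, only the i32 lane count grows from 4 (128-bit) or 8 (256-bit) to 16 (512-bit). Names below are illustrative, not the proposed API:

```rust
/// Scalar model of a VNNI-style u8 x i8 dot-product-accumulate over an
/// arbitrary vector width: `acc` holds width/32 lanes, `a`/`b` hold
/// width/8 bytes. 128-bit => 4 lanes, 256-bit => 8, 512-bit => 16.
fn dot_accum_widened(acc: &mut [i32], a: &[u8], b: &[i8]) {
    assert_eq!(a.len(), acc.len() * 4);
    assert_eq!(b.len(), acc.len() * 4);
    for (lane, slot) in acc.iter_mut().enumerate() {
        let mut sum: i32 = 0;
        for k in 0..4 {
            let i = lane * 4 + k;
            sum += i32::from(a[i]) * i32::from(b[i]);
        }
        // accumulate without saturation, as in the non-saturating form
        *slot = slot.wrapping_add(sum);
    }
}

fn main() {
    // 512-bit shape: 16 i32 lanes over 64 bytes of input
    let mut acc = [0i32; 16];
    dot_accum_widened(&mut acc, &[1u8; 64], &[5i8; 64]);
    println!("{:?}", &acc[..4]); // first lanes: [20, 20, 20, 20]
}
```

The 512-bit variant is attractive for int8 GEMM kernels, where one instruction replaces a widen-multiply-add sequence per lane.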
-
The WASM [Relaxed SIMD](https://github.com/WebAssembly/relaxed-simd) instructions were stabilized in [Rust v1.82](https://blog.rust-lang.org/2024/10/17/Rust-1.82.0.html#stabilized-apis).
This inclu…
-
@tomaarsen
Just wanted to know if CLIP (text + image) embedding models will have an ONNX quantized model? I tried finding it everywhere but had no luck. If it is there, can you please point me to it?…
-
I am running ollama on an i7-14700K, which supports AVX2 and AVX_VNNI, and a GeForce GTX 1060.
After reading #2205, I enabled `OLLAMA_DEBUG=1` to check whether ollama utilizes AVX2 on this CPU. But unlike th…
-
I have an Intel CPU that supports a number of AVX features, but most of them are not picked up when using ollama. Below is the llama.log file:
system info: AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_…