-
Hi there, I am following the instructions to get CoreML working on an Apple Silicon M1.
After getting everything running and trying to transcribe the jfk sample, I only get a wrong transcription:
```
[00:0…
-
# Prerequisites
Before submitting your question, please ensure the following:
- [x] I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versio…
-
When running `docker run localagi/gpt4all-cli:main repl`, I am getting this error:
```
Traceback (most recent call last):
File "/cli/app.py", line 118, in <module>
app()
File "/cli/app.p…
fogs · updated 6 months ago
-
I have downloaded Cerebras-GPT-1.3B from Cerebras here: https://huggingface.co/cerebras/Cerebras-GPT-1.3B
After converting the model weights to GGML with
```
python3 ./examples/gpt-2/convert-cerebras-to-gg…
-
I tried running whisper.cpp on a ThinkPad X220 today, and the program crashed with SIGILL.
Is there some inline assembly assuming a newer CPU?
```
% gdb bin/main
[...]
(gdb) run -m model-medium…
-
### What is the issue?
I have some issues compiling the latest Ollama on an ARM NVIDIA Jetson platform; the CUDA version is 11.2 with JetPack 5.1.2.
ggml-quants.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
…
-
- [x] Fix the current runtime errors.
- [x] Fix inference-accuracy issues [the LLaMA-family FP16 accuracy issue is resolved].
- [x] Use a memory pool (buddy system) to manage temporarily used device or host memory. We also need to release memory promptly once it is no longer in use (release on stream synchronization, release when dst is reused; this should be refactored into something more elegant) to avoid OOM. Only a single GPU is supported for now; if memory becomes a problem, consider running synchronously first.
- [ ] Custom operators, us…
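The buddy-system item in the checklist above can be sketched as a minimal host-side pool in C. All names, the block sizes, and the sized `pool_free` interface (caller passes the order back) are illustrative assumptions, not the project's actual allocator:

```c
#include <stddef.h>
#include <string.h>

#define MIN_ORDER 5                       /* smallest block: 32 B */
#define MAX_ORDER 12                      /* whole arena:  4096 B */
#define ARENA_SIZE (1u << MAX_ORDER)

static unsigned char arena[ARENA_SIZE];
static void *free_lists[MAX_ORDER + 1];   /* one free list per order */

static void pool_init(void) {
    memset(free_lists, 0, sizeof free_lists);
    *(void **)arena = NULL;               /* arena starts as one free block */
    free_lists[MAX_ORDER] = arena;
}

/* Allocate a block of 2^order bytes, splitting larger blocks as needed. */
static void *pool_alloc(int order) {
    int o = order;
    while (o <= MAX_ORDER && free_lists[o] == NULL)
        o++;                              /* smallest free block that fits */
    if (o > MAX_ORDER)
        return NULL;                      /* pool exhausted: would OOM     */
    unsigned char *block = free_lists[o];
    free_lists[o] = *(void **)block;      /* pop it from its free list     */
    while (o > order) {                   /* split until it fits exactly   */
        o--;
        unsigned char *buddy = block + (1u << o);
        *(void **)buddy = free_lists[o];
        free_lists[o] = buddy;
    }
    return block;
}

/* Return a block; coalesce with its buddy whenever the buddy is free. */
static void pool_free(void *p, int order) {
    unsigned char *block = p;
    while (order < MAX_ORDER) {
        unsigned char *buddy =
            arena + (((size_t)(block - arena)) ^ (1u << order));
        void **prev = &free_lists[order];
        void *cur = free_lists[order];
        while (cur != NULL && cur != (void *)buddy) {
            prev = (void **)cur;          /* walk this order's free list */
            cur = *(void **)cur;
        }
        if (cur == NULL)
            break;                        /* buddy still in use: stop merging */
        *prev = *(void **)cur;            /* unlink the buddy                 */
        if (buddy < block)
            block = buddy;                /* merged block starts at lower half */
        order++;
    }
    *(void **)block = free_lists[order];
    free_lists[order] = block;
}
```

Freeing on stream synchronization would then amount to queueing `(ptr, order)` pairs per stream and draining the queue with `pool_free` when the sync point is reached.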
-
I've been following along with your speed increases on Whisper using ggml, which have been amazing.
It would be interesting to see how Stable Diffusion runs on CPUs using ggml.
Here are current benchmarks…
-
I would love to see MLIR support. MLIR has a built-in Vulkan runner as well as a SPIR-V CPU runner. It seems like this was up-voted, but I don't see any discussion on why CUDA or OpenCL was added t…
-
Got this while running from the main branch in Podman AI Lab:
```
llama_model_loader: loaded meta data with 25 key-value pairs and 291 tensors from /granite-7b-lab-Q4_K_M.gguf (version GGUF V3 (lates…