aarnphm / whispercpp

Pybind11 bindings for Whisper.cpp
Apache License 2.0

bug: Runs Exclusively on CPU #173

Open pkreissel opened 11 months ago

pkreissel commented 11 months ago

Describe the bug

This binding is about 10 times slower than native whisper.cpp on my M2 device because it runs exclusively on the CPU. Run on its own, whisper.cpp uses the GPU without problems, so the Python bindings should be able to do the same.

To reproduce

I ran this code:

from whispercpp import Whisper

w = Whisper.from_pretrained("large")
transcript = w.transcribe_from_file("output.wav")

I compared this with the whisper.cpp command: ./main -f output.wav -m models/ggml-large.bin -otxt
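To put a number on the "about 10 times slower" claim, both paths can be timed with the same wall clock. The following is only a rough sketch: the helper names (time_call, time_command, time_binding, speed_ratio) are made up for illustration, and it assumes that ./main, the model file, and output.wav are laid out as in the command above. The binding calls are the two already shown in this report.

```python
import subprocess
import time
from typing import Callable, Sequence


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by a single call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def time_command(argv: Sequence[str]) -> float:
    """Time an external command, e.g. the native whisper.cpp CLI."""
    return time_call(
        lambda: subprocess.run(list(argv), check=True, capture_output=True)
    )


def time_binding(model_name: str, wav_path: str) -> float:
    """Time model load plus transcription through the Python binding.

    The import is deferred so this module can be loaded without
    whispercpp installed. Model loading is included because the CLI
    timing includes it as well.
    """
    from whispercpp import Whisper  # same calls as in the report above

    def run() -> None:
        w = Whisper.from_pretrained(model_name)
        w.transcribe_from_file(wav_path)

    return time_call(run)


def speed_ratio(binding_seconds: float, native_seconds: float) -> float:
    """How many times slower the binding is than the native run."""
    if native_seconds <= 0:
        raise ValueError("native_seconds must be positive")
    return binding_seconds / native_seconds
```

Assuming the same files as in the report, a comparison would then be speed_ratio(time_binding("large", "output.wav"), time_command(["./main", "-f", "output.wav", "-m", "models/ggml-large.bin", "-otxt"])). A single run is noisy, and the first call to from_pretrained may also download the model, so it is worth repeating the measurement a few times.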

Expected behavior

Transcription runs on the GPU and is about 10x faster than it is now.

Environment

Python 3.11, macOS Sonoma, Apple M2

Jajcus commented 10 months ago

The strength of whisper.cpp lies in the range of back-ends it can use (especially for non-NVIDIA GPU users: OpenVINO, OpenCL). Unfortunately, none of those seem to be supported in these bindings.