ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Python bindings (C-style API) #9

Open ArtyomZemlyak opened 1 year ago

ArtyomZemlyak commented 1 year ago

Good day everyone! I'm thinking about bindings for Python.

So far, I'm interested in 4 functionalities:

  1. Encoder processing
  2. Decoder processing
  3. Transcription of audio (feed audio bytes, get text)
  4. Item 3 plus word-level timestamps (feed audio bytes, get text + the time of each word). Of course, it's too early to think about word timestamps, since even in the Python implementation they are still not done well.
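
For items 3 and 4, "feed audio bytes" needs one preparation step on the Python side: whisper.cpp's `whisper_full` takes 32-bit float samples at 16 kHz mono, not raw bytes. Below is a minimal sketch of that conversion, assuming the input is little-endian signed 16-bit PCM that is already 16 kHz mono. The helper name `pcm16_to_float32` is hypothetical and not part of any binding.

```python
import array
import sys
from typing import List

WHISPER_SAMPLE_RATE = 16000  # whisper.cpp expects 16 kHz mono input


def pcm16_to_float32(pcm_bytes: bytes) -> List[float]:
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1.0, 1.0).

    Hypothetical helper: assumes the audio is already 16 kHz mono.
    """
    if len(pcm_bytes) % 2:
        raise ValueError("PCM16 data must contain an even number of bytes")
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian; fix up on big-endian hosts
    return [s / 32768.0 for s in samples]


def duration_seconds(samples: List[float]) -> float:
    """Length of the clip in seconds at the whisper.cpp sample rate."""
    return len(samples) / WHISPER_SAMPLE_RATE
```

The resulting list can then be copied into a float buffer (for example a `ctypes` float array) before it is handed to whichever binding is used.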

Perhaps in the near future, I will try to take up this task. But I have no experience with Python bindings. So, if there are craftsmen who can do it quickly (if it can be done quickly... 😃), that would be cool!

carloscdias commented 1 year ago

Most Python bindings I found in the last week were outdated or broke with the current API, so I made a project (https://github.com/carloscdias/whisper-cpp-python) following the same pattern as ggerganov's original answer. I also followed his suggestion to provide a way to automatically generate the Python bindings from whisper.h. I plan to provide an interface compatible with the official whisper clients, similar to what was done in https://github.com/abetlen/llama-cpp-python. It's for my own use, but if it proves useful for anyone else, feel free to give it a try.

silvacarl2 commented 1 year ago

thank you!! checking it out!

hoonlight commented 1 year ago

That's great! This should be added to https://github.com/ggerganov/whisper.cpp#bindings.

benniekiss commented 9 months ago

Looking around at the available Python bindings, none currently seem to support the latest branch of whisper.cpp with GPU acceleration for CUDA or Metal. Does anyone have a working version? A lot has changed in whisper.cpp, and it seems most of the Python bindings are based on an older version that lacks many of the more recent functions.

albcunha commented 8 months ago

I'm in the same boat. I can run whisper.cpp with ROCm on the CLI, but I keep getting segmentation faults or other crashes with all the wrappers I tried.
The only one that wasn't a complete failure was the code from synesthesiam above, but it just returned empty output for me. I checked and rechecked whisper_full_params, which seems different on the ROCm build, but it does not work.

It's not whisper.cpp's fault. Let's hope someone comes up with help.

albcunha commented 8 months ago

Just to give some feedback: I wanted to try whisper.cpp because I'm using an AMD RX 5700 XT with 8 GB of VRAM and wanted to use the Whisper large model. I ended up using Hugging Face transformers, and I could fit this model on the GPU.

dnhkng commented 8 months ago

Disappointing there are so many unmaintained Python bindings.

Update: As I needed this on CUDA, I've tried to fix it myself. Seems to work OK on Ubuntu+Nvidia GPU, but not yet on Mac. Please test on Windows and report back! Code is on my PR: #1524

chrisspen commented 6 months ago

Unmaintained is putting it mildly. Even the ones that work don't work well. Every single Python binding of the C++ implementation I've tested is significantly slower than the pure-Python version, which is mind boggling.

Terrible implementations like whisper-cpp-python, which doesn't even publish its code anywhere, take 5 minutes to transcribe a 10-second file that the pure Python implementation can handle in a few seconds using the same large model...

dnhkng commented 6 months ago

@chrisspen Did you try my PR? I'm using it for a real-time LLM chatbot. Using distil-whisper, I can get Voice->text and then text->voice in a few hundred ms.

chrisspen commented 6 months ago

@dnhkng Yes. The problem seems to be the C++ code. Might work fine with a gpu, but on cpu, it runs slower than pure Python on a 10 year old machine. And if C++ needs an expensive gpu to be faster than Python on a cpu, it's not good code.

I'm finding faster_whisper is much more usable for Python and far more cost effective.

egfthomas commented 6 months ago

@ArtyomZemlyak First you reinvent the pytorch functions in c, then you want python bindings around them. Isn't the end result the same as what we have in pytorch?

Is there a streaming function in the original Python/PyTorch implementation?

SeeknnDestroy commented 2 months ago

Can I use faster_whisper for real-time transcription tasks?

chrisspen commented 2 months ago

@SeeknnDestroy

Probably not. faster_whisper is a lot faster than the pure Python implementation, but a lot slower than this C++ version.

I'd only recommend faster_whisper when you want good performance but don't have the GPU needed to run whisper.cpp.

hboehmer868 commented 1 month ago

After some struggle with the Python bindings documented in the README, and after trying whisper-cpp-python without success, I landed on pywhispercpp. Might be worth adding to the list in the README @ggerganov

BBC-Esq commented 1 month ago

I agree. I tested it out and it works alright, but it doesn't have GPU acceleration yet. The maintainer said it's just a time commitment thing, which I can understand. Would love to get some Python bindings from somewhere that also support GPU so I can do some more benchmarking.
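
For comparing bindings on equal terms, one option is to report a real-time factor (processing time divided by audio length) rather than raw seconds. The sketch below is a hypothetical harness, not tied to any binding: `transcribe` stands for whatever zero-argument callable wraps the library under test, and `real_time_factor` is an illustrative name.

```python
import time
from typing import Callable


def real_time_factor(transcribe: Callable[[], object],
                     audio_seconds: float,
                     runs: int = 3) -> float:
    """Return best-of-N processing time divided by audio duration.

    Values below 1.0 mean faster than real time. `transcribe` is a
    hypothetical zero-argument callable wrapping the binding under test.
    """
    if audio_seconds <= 0 or runs < 1:
        raise ValueError("audio_seconds must be > 0 and runs must be >= 1")
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        transcribe()
        best = min(best, time.perf_counter() - start)
    return best / audio_seconds
```

By this measure, the "5 minutes for a 10 second file" figure reported earlier in the thread corresponds to a factor of 300 / 10 = 30.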

hboehmer868 commented 1 month ago

@BBC-Esq I have gotten pywhispercpp to run with GPU support. You can clone it from source and build it with CUDA support enabled, just like you do with whisper.cpp itself. Fair warning: there are some issues with installing directly from source, as you can read in my issue over there.

Here is how I currently do it:

# Clone from source
git clone --branch v1.2.0 --recurse-submodules https://github.com/abdeladim-s/pywhispercpp.git
# Build a python wheel with CUDA support
# FYI: The submodule whisper.cpp in pywhispercpp is currently pinned at version 1.5.4,
# which still uses the old cmake flag for CUDA support; later versions use -DWHISPER_CUDA=1
cd pywhispercpp
CMAKE_ARGS="-DWHISPER_CUBLAS=1" python3 -m build --wheel
# Install the wheel into your python environment
python3 -m pip install dist/pywhispercpp-*.whl
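
The flag choice in the comment above can be captured in a small helper when the build is scripted. This is a sketch only: the function names are hypothetical, and the 1.5.4 cutoff is taken from the comment above rather than from the whisper.cpp changelog, so check the CMake options of the version you actually build.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.5.4' or 'v1.5.4' into (1, 5, 4) for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def cuda_cmake_arg(whisper_cpp_version: str) -> str:
    """Pick the CUDA cmake flag for a given whisper.cpp version.

    Assumption (from the build notes above): versions up to 1.5.4 use
    the old flag, later versions use -DWHISPER_CUDA=1.
    """
    if parse_version(whisper_cpp_version) <= (1, 5, 4):
        return "-DWHISPER_CUBLAS=1"
    return "-DWHISPER_CUDA=1"


if __name__ == "__main__":
    # Print the value to export as CMAKE_ARGS for the pinned submodule version.
    print(cuda_cmake_arg("1.5.4"))
```

The printed value is what goes into `CMAKE_ARGS` before running `python3 -m build --wheel`.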