stlukey / whispercpp.py

Python bindings for whisper.cpp
MIT License

ggml-large.bin doesn't exist anymore on Hugging Face #28

Open athoune opened 10 months ago

athoune commented 10 months ago

The large model doesn't work:

>>> w = Whisper('large')
Downloading ggml-large.bin...
whisper_init_from_file_no_state: loading model from '/Users/mlecarme/.ggml-models/ggml-large.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init_no_state: failed to load model

You have to pick v1, v2 or v3.

See https://huggingface.co/ggerganov/whisper.cpp/tree/main
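The fix can be sketched generically: register the versioned filename and its download URL before constructing the model. This is a minimal sketch, assuming whispercpp keys its `MODELS` dict by the `ggml-<variant>.bin` filename and that the files live under `ggerganov/whisper.cpp` on Hugging Face; the `register_model` helper is hypothetical, not part of the bindings.

```python
# Base URL for the ggml model files on Hugging Face (from the link above).
HF_BASE = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main"

def register_model(models: dict, variant: str) -> str:
    """Hypothetical helper: add ggml-<variant>.bin to a MODELS-style
    dict so Whisper('<variant>') can find and download it. Returns the URL."""
    filename = f"ggml-{variant}.bin"
    url = f"{HF_BASE}/{filename}"
    models[filename] = url
    return url

# Usage: in the real bindings you would pass whispercpp.MODELS here.
MODELS = {}  # stand-in for whispercpp.MODELS
register_model(MODELS, "large-v3")
# MODELS now maps "ggml-large-v3.bin" to its resolve/main URL
```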

micseydel commented 2 months ago

Thank you, that helped!

>>> import whispercpp 
>>> whispercpp.MODELS["ggml-large-v3.bin"] = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin"
>>> w_large = whispercpp.Whisper('large-v3')
Downloading ggml-large-v3.bin...
whisper_init_from_file_no_state: loading model from '/Users/micseydel/.ggml-models/ggml-large-v3.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: f16           = 1
whisper_model_load: type          = 5
whisper_model_load: mem required  = 3342.00 MB (+   71.00 MB per decoder)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: model ctx     = 2951.32 MB
whisper_model_load: model size    = 2951.01 MB
whisper_init_state: kv self size  =   70.00 MB
whisper_init_state: kv cross size =  234.38 MB

It seems like I must still be doing something wrong, though:

>>> result = w_large.transcribe("/Users/micseydel/transcriptions/2024-08-10/Tom Froese and Michael Levin discuss Tom's Irruption theory.mp4")
Loading data..
Transcribing..
whisper_full_with_state: progress =   5%
...
whisper_full_with_state: progress = 100%
>>> text = w_large.extract_text(result)
Extracting text...
>>> len(text)
0
>>> type(result)
<class 'int'>
>>> result
0
>>> text
[]
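One possible cause of the empty result (a guess, not confirmed in this thread): whisper.cpp operates on 16 kHz mono PCM audio, and the .mp4 container may not be getting decoded into usable samples, so transcription runs over empty data and `transcribe` returns `0` with no text. A cheap thing to try is converting the file to a 16 kHz mono WAV with ffmpeg first and transcribing that instead. The sketch below just builds and runs the ffmpeg command; ffmpeg must be installed, and the file names are placeholders.

```python
import subprocess

def ffmpeg_to_wav_cmd(src: str, dst: str) -> list:
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit WAV."""
    return [
        "ffmpeg", "-y",       # -y: overwrite dst if it already exists
        "-i", src,            # input file (e.g. the .mp4)
        "-ar", "16000",       # resample to 16 kHz, what whisper.cpp expects
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit signed little-endian PCM
        dst,
    ]

def convert_to_wav(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_to_wav_cmd(src, dst), check=True)

# Then transcribe the WAV instead of the mp4:
# convert_to_wav("talk.mp4", "talk.wav")
# result = w_large.transcribe("talk.wav")
```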