edgenai / edgen

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
https://docs.edgen.co/
Apache License 2.0

bug: audio transcriptions fails with "failed to initialize the whisper context" #55

Closed prabirshrestha closed 6 months ago

prabirshrestha commented 7 months ago
curl http://localhost:33322/v1/audio/transcriptions \
  -H "Authorization: Bearer no-key-required" \
  -H "Content-Type: multipart/form-data" \
  -F file="@/Users/prabirshrestha/Downloads/frost.wav" \
  -F model="default"
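
For anyone who wants to reproduce this without curl, here is a rough Python equivalent of the request above, using only the standard library. The URL, the bearer token, and the `file` and `model` form fields come from the curl command; the helper names `build_multipart` and `transcribe` are made up for illustration, and the sketch assumes the server simply returns a text or JSON body.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, url="http://localhost:33322/v1/audio/transcriptions", model="default"):
    """POST an audio file to the local server and return the raw response text."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model}, "file", audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer no-key-required",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `transcribe("/Users/prabirshrestha/Downloads/frost.wav")` with the server running should send the same request as the curl command. If the server answers with an HTTP error status, `urlopen` raises `urllib.error.HTTPError` rather than returning the body.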

log:

2024-02-12 23:43:51.354 Edgen[19183:19652515] WARNING: Secure coding is not enabled for restorable state! Enable secure coding by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState: and returning YES.
2024-02-13T07:44:02.282828Z ERROR whisper_cpp::internal: ggml: whisper_model_load: tensor 'encoder.conv1.weight' has wrong shape in model file: got [80, 768, 1], expected [3, 80, 768]
2024-02-13T07:44:02.282846Z ERROR whisper_cpp::internal: ggml: whisper_init_with_params_no_state: failed to load model
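
A quick check on the error above: the two shapes do not hold the same number of elements, so the mismatch is more than a difference in dimension order. The sketch below pulls the shapes out of the log line and compares element counts. The function name `parse_shape_mismatch` is made up for illustration, and the regex is written against this one log line only, not against any documented whisper.cpp log format.

```python
import math
import re

# Pattern written against the single log line in this report (an assumption,
# not a documented format).
SHAPE_RE = re.compile(
    r"tensor '(?P<name>[^']+)' has wrong shape in model file: "
    r"got \[(?P<got>[\d, ]+)\], expected \[(?P<expected>[\d, ]+)\]"
)


def parse_shape_mismatch(line):
    """Return (tensor_name, got_shape, expected_shape), or None if no match."""
    match = SHAPE_RE.search(line)
    if match is None:
        return None

    def to_shape(text):
        return tuple(int(dim) for dim in text.split(","))

    return match.group("name"), to_shape(match.group("got")), to_shape(match.group("expected"))


line = (
    "2024-02-13T07:44:02.282828Z ERROR whisper_cpp::internal: ggml: "
    "whisper_model_load: tensor 'encoder.conv1.weight' has wrong shape in "
    "model file: got [80, 768, 1], expected [3, 80, 768]"
)
name, got, expected = parse_shape_mismatch(line)
print(name, got, math.prod(got), expected, math.prod(expected))
# prints: encoder.conv1.weight (80, 768, 1) 61440 (3, 80, 768) 184320
```

The file's tensor has a third of the elements the loader expects. That would be consistent with the model file and the bundled whisper.cpp disagreeing about the tensor layout, but nothing in this thread confirms the root cause.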

config:

audio_transcriptions_models_dir: /Users/prabirshrestha/code/llm/audio_transcriptions
audio_transcriptions_model_name: ggml-distil-small.en.bin
audio_transcriptions_model_repo: distil-whisper/distil-small.en

OS: macOS Sonoma 14.2.1 on an Apple M3

pedro-devv commented 6 months ago

We have updated our whisper.cpp version; please let me know if it still doesn't work. I can't reproduce this bug on Windows or Linux, and it doesn't seem to be something on our side. Unfortunately, we don't have any Mac hardware, so I can't test there.

prabirshrestha commented 6 months ago

It would be great if there were nightly builds so I could try it easily without building from scratch.

prabirshrestha commented 6 months ago

I have verified this works on Mac. Thanks for the fix!