sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models
MIT License

Memory required with model medium and large #17

Closed sondt closed 1 year ago

sondt commented 1 year ago

I downloaded the Medium and Large models from https://ggml.ggerganov.com/. After running with the Medium model, I get this output:

whisper_init_from_file: loading model from 'ggml-model-whisper-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 2
whisper_model_load: type = 4
whisper_model_load: mem required = 1720.00 MB (+ 43.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 1462.35 MB

sandrohanea commented 1 year ago

The memory requirements should be the same as for whisper.cpp (Whisper.net just calls it under the hood). Of course, the .NET runtime will use some extra memory (GC and everything), but that shouldn't be significant in this case.

You can find the requirements here: https://github.com/ggerganov/whisper.cpp#memory-usage
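
For a quick estimate, the "mem required" line in the load log can be read directly. The following is a minimal Python sketch, not part of Whisper.net or whisper.cpp: the `estimate_memory_mb` helper and its regex are hypothetical and only match the log format quoted in the question, and it assumes the per-decoder figure is added once for each decoder instance. It does not account for the extra .NET runtime overhead mentioned above.

```python
import re

# Matches the "mem required" line in the format shown in the quoted log.
# This pattern is an assumption based on that one log; other versions of
# whisper.cpp may print the figures differently.
MEM_LINE = re.compile(
    r"mem required\s*=\s*([\d.]+)\s*MB\s*\(\+\s*([\d.]+)\s*MB per decoder\)"
)


def estimate_memory_mb(log_text, decoders=1):
    """Return base memory plus the per-decoder cost for `decoders` decoders."""
    match = MEM_LINE.search(log_text)
    if match is None:
        raise ValueError("no 'mem required' line found in log")
    base_mb = float(match.group(1))
    per_decoder_mb = float(match.group(2))
    return base_mb + per_decoder_mb * decoders


log = "whisper_model_load: mem required = 1720.00 MB (+ 43.00 MB per decoder)"
print(estimate_memory_mb(log))
```

With the Medium model log from the question, this adds 43 MB to the 1720 MB base for a single decoder.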