alesaccoia / VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
MIT License

Prioritize GPU Over CPU for Initial Loading of OpenAI Whisper Model #24

Open jinmiaoluo opened 5 months ago

jinmiaoluo commented 5 months ago

Load the OpenAI Whisper model directly onto the GPU instead of into RAM via the CPU.

Currently, our app loads the OpenAI Whisper model into RAM using the CPU only, rather than loading it onto the GPU first.
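
As a minimal sketch of the idea, the snippet below uses the openai-whisper package's `device` argument to place the model on the GPU when one is available and fall back to CPU otherwise. The model size ("medium") and audio file name are placeholders, and the actual project may use a different backend (e.g. faster-whisper), so this is only illustrative of the loading strategy being requested.

```python
import torch
import whisper  # openai-whisper package; the project backend may differ

# Prefer the GPU when one is available; fall back to CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Passing `device` tells load_model to map the checkpoint onto that device,
# so the weights go to the GPU instead of staying in CPU RAM.
model = whisper.load_model("medium", device=device)

# Hypothetical input file for illustration.
result = model.transcribe("audio.wav")
print(result["text"])
```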