Closed jiawade closed 1 year ago
There is no direct option for splitting the load between GPU and CPU. The faster-whisper models have smaller memory footprints: its large-v2
model in FP16 takes up less than 5 GB. You can install faster-whisper and then use it with stable-ts:
import stable_whisper

model = stable_whisper.load_faster_whisper('large-v2', device="cuda", compute_type="float16")
result = model.transcribe_stable(file, language='en')
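If FP16 still does not fit once other processes are using VRAM on a 6 GB card, faster-whisper can also run with quantized weights through CTranslate2. A minimal sketch, assuming compute_type="int8_float16" is acceptable for your accuracy needs (the file path is a placeholder):

import stable_whisper

file = 'audio.mp3'  # placeholder path to the media file

# int8_float16 stores the weights in 8-bit while computing in FP16,
# which reduces the VRAM footprint compared to plain float16
model = stable_whisper.load_faster_whisper('large-v2', device="cuda", compute_type="int8_float16")
result = model.transcribe_stable(file, language='en')
result.to_srt_vtt('audio.srt')  # stable-ts results can be exported to subtitle files

It is worth checking the int8 output against a float16 run on a short sample before committing to it.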
@jianfch thanks
I'm trying to use the GPU to transcribe:

model = stable_whisper.load_model('large-v2').cuda(0)
result = model.transcribe(file, language='en')

but after running the above code it raises the following error:

CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 6.00 GiB total capac

My graphics card is an NVIDIA 2060 with 6 GB. Can I add a parameter to use only part of the GPU for acceleration when transcribing video, i.e., have the CPU and GPU work at the same time? Thanks.