Most of the usage is in the CPU and not in the GPU

yakovw commented 1 year ago

I tested the new version, and most of the usage is on the CPU; it only occasionally uses the GPU for a moment. I don't know why not all of the work runs on the GPU, which would be much faster. For comparison, https://github.com/Const-me/Whisper runs only on the GPU and works very fast. Thanks for everything. It definitely improved performance, but not as much as I expected.

sandrohanea commented 1 year ago

Hello @yakovw and thanks for the feedback,

Indeed, I observed that the CPU is still used intensively, but this is true for whisper.cpp as well: https://github.com/ggerganov/whisper.cpp#nvidia-gpu-support-via-cublas

Whisper.net is just a wrapper over the native whisper.cpp, while https://github.com/Const-me/Whisper is a port of whisper.cpp with DirectX support.

For now, only the encoder is offloaded to the GPU, while the mel spectrogram and the decoder still run on the CPU (same as in whisper.cpp).
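To make the effect of partial offloading concrete, here is a minimal sketch based on Amdahl's law. It is not taken from whisper.cpp or Whisper.net; the function names are made up for illustration, and the 60% share and 10x GPU speedup are assumed numbers, not measurements.

```python
def overall_speedup(offloaded_fraction: float, stage_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only part of the work is accelerated.

    offloaded_fraction: share of the original runtime spent in the stage moved to the GPU.
    stage_speedup: how many times faster that stage runs on the GPU.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    # New runtime (relative to 1.0) = untouched CPU part + accelerated part.
    new_runtime = (1.0 - offloaded_fraction) + offloaded_fraction / stage_speedup
    return 1.0 / new_runtime


def cpu_share_after_offload(offloaded_fraction: float, stage_speedup: float) -> float:
    """Fraction of the new runtime still spent in the CPU-only stages."""
    cpu_time = 1.0 - offloaded_fraction
    gpu_time = offloaded_fraction / stage_speedup
    return cpu_time / (cpu_time + gpu_time)


if __name__ == "__main__":
    # Hypothetical numbers: the offloaded stage was 60% of the runtime
    # and becomes 10x faster on the GPU.
    fraction, speedup = 0.6, 10.0
    print(f"overall speedup: {overall_speedup(fraction, speedup):.2f}x")
    print(f"time still on CPU: {cpu_share_after_offload(fraction, speedup):.0%}")
```

With these assumed numbers the whole run gets about 2.2x faster, yet roughly 87% of the remaining time is spent in the CPU-only stages, which would match seeing the GPU active only for short moments.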

If whisper.cpp is improved to offload more work to the GPU, newer versions of Whisper.net will automatically get all these improvements.

I will close this issue for now, as the concern is with the native whisper.cpp, not with this wrapper.

yakovw commented 1 year ago

I'm good, thank you very much. Indeed, without a doubt, it is an improvement in every way.

sandrohanea / whisper.net #79