Vaibhavs10 / insanely-fast-whisper


You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. #253

Open solarslurpi opened 1 day ago

solarslurpi commented 1 day ago

I am using WSL and have CUDA 12.4 installed. I installed Flash Attention with `pipx runpip insanely-fast-whisper install flash-attn --no-build-isolation`.

Command: `insanely-fast-whisper --model-name "openai/whisper-large-v3-turbo" --flash True --file-name test.mp3`. I got the same warning when I did not provide the model name.

I expected the model to already be on the GPU. How do I fix this warning: "You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`." Thank you.
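For reference, the pattern the warning is asking for looks roughly like this in plain `transformers` (a minimal sketch, not the insanely-fast-whisper internals):

```python
# Minimal sketch of what the warning asks for: load the model with
# Flash Attention 2 enabled, then move it to the GPU with .to("cuda").
import torch
from transformers import AutoModelForSpeechSeq2Seq

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,               # Flash Attention 2 needs fp16/bf16
    attn_implementation="flash_attention_2",
)
model.to("cuda")  # the warning fires if this step is skipped
```

If insanely-fast-whisper emits the warning even with `--flash True` and `--device-id 0`, the model is presumably being loaded before the move to the GPU happens, which is what this issue is about.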

iwyc0 commented 13 hours ago

Same problem here.