You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`. #253
I am using WSL and have CUDA 12.4 installed. I installed flash-attn with:

```
pipx runpip insanely-fast-whisper install flash-attn --no-build-isolation
```

Then I ran:

```
insanely-fast-whisper --model-name "openai/whisper-large-v3-turbo" --flash True --file-name test.mp3
```

I got the same warning when I did not provide a model name. I expected the model to be on the GPU. How do I fix this?
> You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.

Thank you.
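For context, my understanding (an assumption on my part, since I have not read the CLI source) is that the tool does roughly the equivalent of the transformers snippet below, and the warning fires because the model is first materialized on CPU before being moved to the GPU:

```python
# Minimal sketch of what I assume the CLI does internally via transformers.
# The dtype and device placement here are my guesses, not the tool's exact code.
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

model_id = "openai/whisper-large-v3-turbo"

# Requesting flash_attention_2 while the weights load on CPU is what
# triggers the warning; moving to GPU afterwards is the suggested fix.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FA2 requires fp16/bf16
    attn_implementation="flash_attention_2",
)
model.to("cuda")

processor = AutoProcessor.from_pretrained(model_id)
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)
print(pipe("test.mp3")["text"])
```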