sanchit-gandhi / whisper-jax

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Apache License 2.0

How to use only CPU for inference? Please provide an example. Thank you. #91

Open mdys opened 1 year ago

mdys commented 1 year ago

I only have a CPU right now. How can I run this project on CPU only? I get an error when running the code from the README.md.

mdys commented 1 year ago

```python
from whisper_jax import FlaxWhisperPipline
import jax.numpy as jnp

pipeline = FlaxWhisperPipline("openai/whisper-large-v2", batch_size=16)
outputs = pipeline("2.wav", task="transcribe", return_timestamps=True)
```

```
python test.py
2023-05-11 18:10:48.898676: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
Killed
```
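The final `Killed` line is most likely the Linux out-of-memory killer rather than a whisper-jax error: the large-v2 checkpoint in float32 needs several gigabytes for the weights alone, and `batch_size=16` adds substantial activation memory on top. As a quick sanity check (a minimal sketch, not from this thread), you can confirm that JAX has indeed fallen back to the CPU backend:

```python
# Quick sanity check: list the devices JAX can see.
# With no GPU/TPU attached, this should print something like [CpuDevice(id=0)].
import jax

print(jax.devices())
print(jax.default_backend())  # expected: "cpu"
```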

flexchar commented 1 year ago

Based on experiments on an M2 MacBook Pro, this library performed very slowly. I'd kindly suggest taking a look at https://github.com/ggerganov/whisper.cpp, which runs at around 5-10x on Apple Silicon (not even using the CoreML models).

That being said, when transcribing longer videos on a TPU v3-8 on GCP, I observed speeds of 100x.

mdys commented 1 year ago

Thanks! I only have a CPU right now. I need an example Python script that runs smoothly on CPU only. Can you help me?
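For what it's worth, a minimal CPU-only sketch along these lines should work. The checkpoint choice and batch size here are assumptions, picked so that peak memory stays within typical desktop RAM (large-v2 with `batch_size=16` is what got OOM-killed above); `JAX_PLATFORMS` is the standard JAX environment variable for pinning the backend, spelled `JAX_PLATFORM_NAME` on older JAX releases:

```python
import os

# Pin JAX to the CPU backend before importing it anywhere.
# (On older JAX releases this variable is JAX_PLATFORM_NAME instead.)
os.environ["JAX_PLATFORMS"] = "cpu"

import jax.numpy as jnp
from whisper_jax import FlaxWhisperPipline

# Assumption: the smaller "openai/whisper-small" checkpoint and a modest
# batch size are used so the model and activations fit in typical RAM.
pipeline = FlaxWhisperPipline(
    "openai/whisper-small",
    dtype=jnp.float32,  # CPUs generally have no fast half-precision path
    batch_size=4,
)

outputs = pipeline("2.wav", task="transcribe", return_timestamps=True)
print(outputs["text"])
```

The large-v2 checkpoint can still run on CPU if you have enough RAM; it will just be very slow, so a smaller checkpoint is usually the more practical choice.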

qhgy commented 1 year ago

me too