Closed: DamianB-BitFlipper closed this issue 10 months ago
Hi WhisperFusion team. Your demo is really quite shockingly good.

I am trying to replicate it, but my system is rather slow and laggy. I suspect the GPU I am running Whisper and the LLM on is not powerful enough; I am currently renting a single L4 GPU.

What are your hardware recommendations for running this model?

I'd also recommend adding this information to the README for others' future reference. Thanks!
The demo was recorded on a 4090. We also tested it on a 3090, but the latency is not as low. If you want to replicate it quickly, you can rent a 4090 on e.g. https://cloud.vast.ai/
I see. The L4 I'm renting from Google Cloud Compute is about the speed of a 3090 (a bit slower, actually: ~31 TFLOPS vs. ~35 TFLOPS FP32). I'll see if I can run this on something beefier.

I can also open a small PR tomorrow that edits the README to say the recommended hardware is a GPU at least as fast as a 4090.
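Something along these lines, as a starting point (exact wording and numbers to be confirmed):

```markdown
## Hardware recommendations

WhisperFusion is latency-sensitive. The demo was recorded on an RTX 4090
(~83 TFLOPS FP32), and we recommend a GPU at least that fast for the
lowest latency. Slower cards such as the RTX 3090 (~35 TFLOPS) or the
L4 (~31 TFLOPS) still run the pipeline, but responses feel noticeably
laggier.
```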
If you can open a PR with the change, that would be great.
@zoq can you please share the steps to deploy this on vast.ai? I'm using the Nvidia hashcat image, but it won't let me run Docker via `sudo docker`.

I think this is meant to be deployed using templates. What are the steps to create a template? (See the sketch below.)
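Note for anyone hitting the same issue: vast.ai instances themselves run as Docker containers, so nested `sudo docker` generally isn't available inside them. The usual route is to point a template, or the vastai CLI, directly at the image you want to run. A rough sketch with the CLI follows; the image tag below is hypothetical, so substitute whatever image the README actually publishes:

```sh
pip install vastai              # vast.ai command-line client
vastai set api-key YOUR_API_KEY

# Find an RTX 4090 offer and note its ID
vastai search offers 'gpu_name=RTX_4090 num_gpus=1'

# Launch the offer with the desired image (tag is hypothetical; check the README)
vastai create instance OFFER_ID \
  --image ghcr.io/collabora/whisperfusion:latest \
  --disk 64
```

Alternatively, create a template in the vast.ai web UI pointing at the same image and launch instances from that template.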