chandan0000 opened 1 month ago
Same here. I have a faster-whisper-server running on another machine and am looking at how to integrate with it instead of Whisper, as in this example. This would offload my GPU (3060 Ti, ~7.5 GB) and give me a chance to test the setup.
@AlexHayton @andimarafioti @eustlb @Vaibhavs10: any hints on how I can point the STT stage to my running faster-whisper-server?
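Until there is native support in the pipeline, one way to sketch the integration is a small client that posts audio to the server's OpenAI-compatible transcription endpoint (faster-whisper-server exposes one). The host, port, model name, and function names below are placeholder assumptions for illustration, not values from this thread:

```python
# Hypothetical client sketch for a remote faster-whisper-server.
# Assumes the server exposes an OpenAI-compatible
# POST /v1/audio/transcriptions endpoint; host, port, and model name
# are placeholders.

def transcription_url(base_url: str) -> str:
    """Build the OpenAI-compatible transcription endpoint URL."""
    return base_url.rstrip("/") + "/v1/audio/transcriptions"

def transcribe_remote(audio_path: str,
                      base_url: str = "http://remote-host:8000",
                      model: str = "Systran/faster-whisper-small") -> str:
    """Send an audio file to the server and return the recognized text."""
    # Imported here so the sketch loads even without requests installed.
    import requests

    with open(audio_path, "rb") as f:
        resp = requests.post(
            transcription_url(base_url),
            files={"file": f},
            data={"model": model, "response_format": "json"},
        )
    resp.raise_for_status()
    return resp.json()["text"]
```

The returned text could then be fed to the pipeline's LM stage in place of the local Whisper output.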
Hi! 👋 Sorry it took me a few days to get here. I would start by trying this out with a smaller LLM, since that is the part that is failing. Why don't you try passing --lm_model_name HuggingFaceTB/SmolLM-360M-Instruct to the python s2s_pipeline.py call?
@andimarafioti I think it's a VRAM issue. I ran with --lm_model_name HuggingFaceTB/SmolLM-360M-Instruct and got the same error again.
oh yeah, that card only has 6 GB of VRAM, you're brave 😅 If you manage to get all of these models to run with that little VRAM, that would be great for the library — please report how you did it!
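For a rough sense of whether a model fits, a back-of-the-envelope rule of thumb (an assumption, not a measurement) is ~2 bytes per parameter for fp16 weights, before activations and KV cache:

```python
# Back-of-the-envelope VRAM estimate for fp16 weights only.
# Real usage is higher: activations, KV cache, and framework
# overhead all add on top of this figure.
def fp16_weight_gib(n_params: float) -> float:
    return n_params * 2 / 1024**3  # 2 bytes per parameter, in GiB

print(f"{fp16_weight_gib(360e6):.2f} GiB")  # SmolLM-360M weights alone
```

By this estimate the 360M model's weights are well under 1 GiB, so on a 6 GB card the pressure likely comes from the other pipeline stages loaded alongside it.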
my system config