Closed: YHSI5358 closed this issue 1 month ago
At this point, it's not possible. The turbo model is only available through HF Transformers, and I removed Transformers support due to its erratic performance and out-of-memory errors. Once the model is released outside of Transformers, it can be added.
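For anyone following along, this is roughly what the Transformers path looks like; a minimal sketch assuming the `openai/whisper-large-v3-turbo` checkpoint and illustrative settings, not subgen's actual code:

```python
# Illustrative sketch only: loading the turbo checkpoint through the
# HF Transformers ASR pipeline. Model id and settings are assumptions.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",  # turbo ships as an HF checkpoint
    torch_dtype=torch.float16,
    device="cuda:0",
)

# Long-form audio is handled by chunking; batch_size drives VRAM use,
# which is where the OOM errors mentioned above tend to come from.
result = asr("audio.wav", chunk_length_s=30, batch_size=8)
print(result["text"])
```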
I tried the transformers branch again and am still seeing similar issues. Transformers requires a large amount of memory to batch correctly, and too much user configuration to tune beam_size so it doesn't throw OOM errors. My limited testing on my 8 GB card again shows that it performs worse than the current setup with subgen. If you want to squeeze more out of your card, you can try running multiple transcriptions by bumping up your concurrent transcriptions variable; see the sketch below.
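Here's a rough sketch of what concurrent transcriptions look like. The `CONCURRENT_TRANSCRIPTIONS` variable name, model size, and file names are assumptions for illustration, not subgen's exact implementation:

```python
# Rough sketch of concurrent transcriptions with faster-whisper.
# Env var name and settings are illustrative assumptions.
import os
from concurrent.futures import ThreadPoolExecutor

from faster_whisper import WhisperModel

# Assumed env var controlling how many transcriptions run at once.
workers = int(os.getenv("CONCURRENT_TRANSCRIPTIONS", "2"))

# One shared model; num_workers lets multiple threads call transcribe
# in parallel, and int8 keeps VRAM use low on an 8 GB card.
model = WhisperModel("medium", device="cuda", compute_type="int8",
                     num_workers=workers)

def transcribe(path: str) -> str:
    # A lower beam_size reduces memory pressure at a small accuracy cost.
    segments, _info = model.transcribe(path, beam_size=3)
    return " ".join(seg.text for seg in segments)

# Hypothetical media files, just to show the fan-out.
files = ["ep1.mkv", "ep2.mkv", "ep3.mkv"]
with ThreadPoolExecutor(max_workers=workers) as pool:
    for text in pool.map(transcribe, files):
        print(text[:80])
```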
This is the latest and reportedly most capable model, and it was released a while ago. I hope it can be added so everyone can get more out of it.