Open mariano54 opened 2 months ago
I am performing a large number of transcriptions on limited GPU space.
I would like to cancel the model's forward pass as soon as I know that I won't need the result. Is it possible to do this with faster-whisper?
If not, can you point me to where in CTranslate2 I would need to make changes to add this feature?
Thank you for your great work.
You can use multithreading: run one thread per transcription, all using the same model instance, and kill a thread when you don't need its result anymore.
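For anyone trying the threaded approach, here is a rough sketch of one way it could look. Python has no supported way to kill a thread from outside, so this uses a cooperative `threading.Event` instead. It relies on the fact that `WhisperModel.transcribe` returns a lazy generator of segments, so decoding only advances while you iterate. The helper name `collect_until_cancelled` is made up for this example and is not part of faster-whisper. Cancellation only takes effect between segments; as far as I can tell, a segment that is already being decoded will still run to completion.

```python
import threading
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def collect_until_cancelled(segments: Iterable[T], cancel: threading.Event) -> List[T]:
    """Consume a lazy segment iterator, stopping once `cancel` is set.

    Hypothetical helper: the flag is checked before each segment is
    requested, so no new segment is decoded after cancellation.
    """
    collected: List[T] = []
    iterator = iter(segments)
    while not cancel.is_set():
        try:
            segment = next(iterator)  # for a lazy generator, this triggers the work
        except StopIteration:
            break  # transcription finished normally
        collected.append(segment)
    return collected


# Intended use with faster-whisper (not executed here; model size and
# file name are placeholders):
#
#   from faster_whisper import WhisperModel
#   model = WhisperModel("small", device="cuda")
#   segments, info = model.transcribe("audio.wav")
#   cancel = threading.Event()
#   worker = threading.Thread(
#       target=collect_until_cancelled, args=(segments, cancel)
#   )
#   worker.start()
#   ...
#   cancel.set()   # ask the worker to stop before the next segment
#   worker.join()
```

The worker exits on its own after `cancel.set()`, so the model instance stays usable by the other threads. Interrupting a segment that is already mid-decode would likely need changes inside CTranslate2 itself.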