jianfch / stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper
MIT License

Can stable-ts be optimized for multi-CPU and multi-threading? #134

Closed · zxl777 closed this issue 1 year ago

zxl777 commented 1 year ago

I found that there is not much difference in processing time when running on 1-core, 5-core, and 8-core CPUs. It seems that the available CPU compute is not being fully utilized.

If the audio could be partitioned and the pieces processed in parallel, it could perhaps be several times faster.

By the way, I want to say that this project is great. I have used the custom sentence segmentation and merging methods, and they are very useful. Thank you.

jianfch commented 1 year ago

Have you tried specifying the number of threads in PyTorch?

torch.set_num_threads(5)
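
The call has to happen before inference starts. A minimal sketch, assuming the stable-ts load_model/transcribe API; the model size and file names are placeholders:

import torch
import stable_whisper

# Cap the number of intra-op CPU threads PyTorch uses for inference.
# This must be called before the model runs.
torch.set_num_threads(5)

model = stable_whisper.load_model('base')   # placeholder model size
result = model.transcribe('audio.mp3')      # placeholder audio path
result.to_srt_vtt('audio.srt')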
zxl777 commented 1 year ago

I tried it (torch.set_num_threads(5)) and the processing time didn't decrease.

jianfch commented 1 year ago

PyTorch uses all available cores by default. On my end it is slower when I set the thread count lower, so torch.set_num_threads is working. Batch processing is not supported, but you can try something similar to https://github.com/openai/whisper/discussions/153#discussioncomment-3746713.
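
The gist of that approach is to split the work into independent pieces and transcribe them in separate processes, each with its own model and a capped thread count so the workers don't compete for the same cores. A rough sketch, not the exact script from the linked discussion; file names, worker count, and model size are placeholders:

import torch
import stable_whisper
from concurrent.futures import ProcessPoolExecutor

# Placeholder list of independent audio files (or pre-split chunks of one file).
AUDIO_FILES = ['part1.mp3', 'part2.mp3', 'part3.mp3']

def transcribe_one(path):
    # Limit threads per worker so the processes don't oversubscribe the CPU.
    torch.set_num_threads(2)
    model = stable_whisper.load_model('base')  # each worker loads its own model
    model.transcribe(path).to_srt_vtt(path + '.srt')
    return path

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=3) as pool:
        for path in pool.map(transcribe_one, AUDIO_FILES):
            print('finished', path)

Loading a model per task is wasteful when there are many files; loading once per worker would be better, but this keeps the sketch short. If you split a single file into chunks, cutting at silence rather than at fixed offsets avoids breaking sentences across chunks.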

zxl777 commented 1 year ago

Thank you. I will use other multithreading solutions to address this issue; running more virtual machines would also achieve the same result.