Open duwenlong2 opened 8 months ago
The library identifies one language at start-up (if "auto" is used) and then uses that language to transcribe the entire file, so it makes sense that it uniformly outputs English, even for the parts where other languages are spoken.
One idea to fix it (not tested) would be to:
- Use WithProbabilities on the builder, which will give you the confidence level for each segment.
- Replace the segments in the result.
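To make the idea a bit more concrete, here is a rough, library-agnostic sketch of the "replace the segments" step in Python. It does not call the library at all: `Segment`, `replace_low_confidence`, the `retranscribe` callback, and the 0.5 threshold are all hypothetical names and values chosen for illustration. It also assumes that the segments to replace are the low-confidence ones and that you have some way to transcribe a time range again with another language setting, neither of which is confirmed above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """Hypothetical container for one transcribed segment."""
    start: float        # segment start, in seconds
    end: float          # segment end, in seconds
    text: str           # transcribed text
    probability: float  # per-segment confidence, assumed to be in [0, 1]


def replace_low_confidence(
    segments: List[Segment],
    retranscribe: Callable[[float, float], Segment],
    threshold: float = 0.5,  # assumed cut-off, needs tuning on real audio
) -> List[Segment]:
    """Return a new list where low-confidence segments are swapped out.

    `retranscribe(start, end)` is a hypothetical callback that transcribes
    the same time range again (for example with a different language) and
    returns a new Segment.
    """
    result = []
    for seg in segments:
        if seg.probability < threshold:
            candidate = retranscribe(seg.start, seg.end)
            # Keep the retry only if it is more confident than the original.
            if candidate.probability > seg.probability:
                seg = candidate
        result.append(seg)
    return result
```

The retry is kept only when it scores higher than the original segment, so a bad second pass cannot make the result worse. Whether the per-segment confidence is actually a good signal for "wrong language" is something that would need to be checked on real recordings.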
It would probably be interesting to have this functionality in the library in the future, but I cannot promise that I'll have time to implement it.
I have a scenario like this: an audio file of a meeting needs to be converted into text for the minutes. The audio contains both Chinese and English sentences, but Whisper outputs everything uniformly in English. I want to preserve each sentence in its original language. Can this be implemented in Whisper?