linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
GNU Affero General Public License v3.0

Update transcribe.py #95

Closed anita-arch closed 11 months ago

anita-arch commented 1 year ago

Remove repeated (looping) text in the transcription.

Jeronymous commented 1 year ago

Thank you @anita-arch for opening a PR.

So you're changing the default value of compression_ratio_threshold, and you claim that this solves https://github.com/linto-ai/whisper-timestamped/issues/94 ?

Can you tell us a bit more, please? Is it a known trick in the "Whisper world"? Do you understand how compression_ratio_threshold works and how it can help here?

RaulKite commented 1 year ago

I have read the same thing here:

https://github.com/openai/whisper/discussions/192

Maybe it could be a parameter? Or a variable in Python?

Jeronymous commented 11 months ago

I am closing this PR because I experimented a bit and found that 1) compression_ratio_threshold has little effect on looping, and 2) the value is an option that can still be changed (both through the CLI and the Python transcribe() function).
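For readers wondering what the threshold actually measures, here is a minimal sketch. It assumes the usual Whisper definition of the compression ratio (UTF-8 byte length of the decoded text divided by its zlib-compressed length) and Whisper's default threshold of 2.4. The sample strings are made up for illustration. In Whisper, a segment whose ratio exceeds the threshold is treated as a failed decode and retried at a higher temperature, so the threshold only flags repetitive output and does not by itself prevent looping.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Ratio of raw UTF-8 size to zlib-compressed size (higher = more repetitive)."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


# Hypothetical sample outputs, not taken from the issue.
looping = "thank you for watching. " * 20
varied = "The quick brown fox jumps over the lazy dog near the quiet river bank."

THRESHOLD = 2.4  # assumed default value of compression_ratio_threshold

for name, text in [("looping", looping), ("varied", varied)]:
    ratio = compression_ratio(text)
    print(f"{name}: ratio={ratio:.2f}, flagged={ratio > THRESHOLD}")
```

Lowering the threshold makes this check stricter, which may explain why it is sometimes suggested as a fix, but it only triggers a retry and the retry can loop as well.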

Thank you @RaulKite for the link!

> Maybe it could be a parameter? Or a variable in Python?

It's already an input option.
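As a rough sketch of what that looks like from Python: the wrapper below (transcribe_with_threshold is a hypothetical helper name, and "tiny" and "cpu" are arbitrary choices) passes compression_ratio_threshold as a keyword argument to transcribe(). It assumes whisper_timestamped is installed. The CLI flag in the comment is assumed to follow the same naming as the Python argument.

```python
def transcribe_with_threshold(audio_path, threshold=2.4, model_name="tiny"):
    """Hypothetical helper: transcribe a file with a custom compression_ratio_threshold."""
    # Third-party package, imported lazily so that defining this helper
    # does not require whisper_timestamped to be installed.
    import whisper_timestamped as whisper

    audio = whisper.load_audio(audio_path)
    model = whisper.load_model(model_name, device="cpu")
    # The threshold is an ordinary keyword argument of transcribe().
    return whisper.transcribe(model, audio, compression_ratio_threshold=threshold)


# Assumed CLI equivalent (flag name mirrors the Python argument):
#   whisper_timestamped audio.wav --compression_ratio_threshold 2.0
```

Calling transcribe_with_threshold("audio.wav", threshold=2.0) would then override the default for a single run without touching the library's code.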