Blair-Johnson / batch-whisper

Batch Support for OpenAI Whisper

Temperature changes are propagated to good portions of batches #8

Open · sbuser opened this issue 1 year ago

sbuser commented 1 year ago

It looks to me like a single compression_ratio or avg_logprob that fails a threshold check causes the temperature to be incremented for the entire batch, which is then re-run at the higher temperature.

As batch_size increases, I believe it becomes more likely that a single segment result with an out-of-bounds parameter will cause the entire batch to be re-evaluated at a higher temperature. With large enough batch sizes this may create a kind of ratcheting where the temperature rapidly rises to 1.0 (or the max), since the higher temperatures may in turn produce a worse compression_ratio or avg_logprob in some other segment of the batch?
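
To make the failure mode concrete, here is a minimal sketch of the all-or-nothing pattern being described, loosely modeled on openai/whisper's `decode_with_fallback`; the batched `model.decode` call and option handling are illustrative assumptions, not this repo's actual code:

```python
from dataclasses import replace
from typing import List

from whisper.decoding import DecodingOptions, DecodingResult


def needs_fallback(result: DecodingResult,
                   compression_ratio_threshold: float = 2.4,
                   logprob_threshold: float = -1.0) -> bool:
    # A segment fails if its output is too repetitive (high compression
    # ratio) or the model was too uncertain (low average log-probability).
    if result.compression_ratio > compression_ratio_threshold:
        return True
    if result.avg_logprob < logprob_threshold:
        return True
    return False


def decode_batch_with_fallback(model, mel_batch,
                               options: DecodingOptions,
                               temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)) -> List[DecodingResult]:
    results: List[DecodingResult] = []
    for t in temperatures:
        opts = replace(options, temperature=t)
        # One forward pass over the WHOLE batch at this temperature.
        results = model.decode(mel_batch, opts)
        # The entire batch is retried at the next temperature if ANY
        # single segment fails a threshold check.
        if not any(needs_fallback(r) for r in results):
            break
    return results
```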

I wonder if there's an efficient way to retain the good segments and only re-run the failed ones? The entire inference is currently re-run against the full batch, so it should be maximally inefficient right now. Is there any reason in principle the re-run couldn't use a smaller batch_size of, e.g., just 1?

Blair-Johnson commented 1 year ago

We should be able to track which DecodingResults failed and which ones succeeded, and re-run only the failed segments. We could either re-run only the failed segments within transcribe_with_fallback, which would still block further pipeline execution for the rest of the batch, or we could track fallback and temperature on a per-audio basis outside of transcribe_with_fallback. The latter seems like it could be faster, but the former would be an easy enough first step.
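
For illustration, a minimal sketch of that first step, reusing `needs_fallback` from the sketch above; the function name and subset indexing are assumptions, not the repo's actual API:

```python
from dataclasses import replace
from typing import List, Optional

from whisper.decoding import DecodingOptions, DecodingResult


def decode_batch_with_selective_fallback(model, mel_batch,
                                         options: DecodingOptions,
                                         temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)) -> List[DecodingResult]:
    n = mel_batch.shape[0]
    final: List[Optional[DecodingResult]] = [None] * n
    pending = list(range(n))  # indices that still need (re-)decoding

    for i, t in enumerate(temperatures):
        opts = replace(options, temperature=t)
        # Decode only the still-failing subset; the retry batch
        # shrinks on each round instead of re-running everything.
        results = model.decode(mel_batch[pending], opts)
        still_failing = []
        for idx, result in zip(pending, results):
            if needs_fallback(result) and i < len(temperatures) - 1:
                still_failing.append(idx)  # retry at a higher temperature
            else:
                final[idx] = result  # keep passing (or last-resort) results
        pending = still_failing
        if not pending:
            break
    return final
```

Shrinking the retry batch this way would also answer the original question: in principle the re-run can use any smaller batch_size, down to a single failed segment.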