Blair-Johnson / batch-whisper

Batch Support for OpenAI Whisper

Temperature changes are propagated to good portions of batches #8

Open · sbuser opened this issue 1 year ago

sbuser commented 1 year ago

It looks to me like a single compression_ratio or avg_logprob that fails a threshold check causes the temperature to be incremented for the entire batch, which is then re-run at the higher temperature.

As batch_size increases, I believe it becomes more likely that a single segment result with an out-of-bounds parameter will cause the entire batch to be re-evaluated at a higher temperature. With large enough batch sizes this may create a kind of ratcheting where the temperature rapidly rises to 1.0 (or the max), since the higher temperatures may in turn produce a worse compression_ratio or avg_logprob in some other segment of the batch?
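
To make the failure mode concrete, here is a minimal sketch of the all-or-nothing pattern being described, loosely modeled on openai/whisper's `decode_with_fallback`; the batched `model.decode` call and option handling are illustrative assumptions, not this repo's actual code:

```python
from dataclasses import replace
from typing import List

from whisper.decoding import DecodingOptions, DecodingResult


def needs_fallback(result: DecodingResult,
                   compression_ratio_threshold: float = 2.4,
                   logprob_threshold: float = -1.0) -> bool:
    # A segment fails if its output is too repetitive (high compression
    # ratio) or the model was too uncertain (low average log-probability).
    if result.compression_ratio > compression_ratio_threshold:
        return True
    if result.avg_logprob < logprob_threshold:
        return True
    return False


def decode_batch_with_fallback(model, mel_batch,
                               options: DecodingOptions,
                               temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)) -> List[DecodingResult]:
    results: List[DecodingResult] = []
    for t in temperatures:
        opts = replace(options, temperature=t)
        # One forward pass over the WHOLE batch at this temperature.
        results = model.decode(mel_batch, opts)
        # The entire batch is retried at the next temperature if ANY
        # single segment fails a threshold check.
        if not any(needs_fallback(r) for r in results):
            break
    return results
```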

I wonder if there's an efficient way to retain the good segments and only re-run the failed ones? The entire inference is currently re-run against the full batch, so it should be maximally inefficient right now. Is there any reason in principle the re-run couldn't use a smaller batch_size of, e.g., just 1?

Blair-Johnson commented 1 year ago

We should be able to track which DecodingResults failed and which ones succeeded, and re-run only the failed segments. We could either re-run only the failed segments within transcribe_with_fallback, which would still block further pipeline execution for the rest of the batch, or we could track fallback and temperature on a per-audio basis outside of transcribe_with_fallback. The latter seems like it could be faster, but the former would be an easy enough first step.
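
For illustration, a minimal sketch of that first step, reusing `needs_fallback` from the sketch above; the function name and subset indexing are assumptions, not the repo's actual API:

```python
from dataclasses import replace
from typing import List, Optional

from whisper.decoding import DecodingOptions, DecodingResult


def decode_batch_with_selective_fallback(model, mel_batch,
                                         options: DecodingOptions,
                                         temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)) -> List[DecodingResult]:
    n = mel_batch.shape[0]
    final: List[Optional[DecodingResult]] = [None] * n
    pending = list(range(n))  # indices that still need (re-)decoding

    for i, t in enumerate(temperatures):
        opts = replace(options, temperature=t)
        # Decode only the still-failing subset; the retry batch
        # shrinks on each round instead of re-running everything.
        results = model.decode(mel_batch[pending], opts)
        still_failing = []
        for idx, result in zip(pending, results):
            if needs_fallback(result) and i < len(temperatures) - 1:
                still_failing.append(idx)  # retry at a higher temperature
            else:
                final[idx] = result  # keep passing (or last-resort) results
        pending = still_failing
        if not pending:
            break
    return final
```

Shrinking the retry batch this way would also answer the original question: in principle the re-run can use any smaller batch_size, down to a single failed segment.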