SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

Log probability threshold is not met with temperature with Amharic language #600

Open asusdisciple opened 10 months ago

asusdisciple commented 10 months ago

I am trying to benchmark faster-whisper, OpenAI's whisper, and insanely-fast-whisper. So far, whisper has been much faster than faster-whisper, so I took a look at the debug messages. I found a lot of the messages shown below, but only for certain languages. For evaluation I use the Fleurs dataset. The messages appear predominantly for the Amharic audio files, and for some other languages.

Maybe you have an idea why this happens. My code for calling faster-whisper:

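import timeit

import librosa
from faster_whisper import WhisperModel

# Assumption: `beam` is not defined in the original post; 5 is the faster-whisper default.
beam = 5
# `filenames` (a list of audio file paths) is defined elsewhere and not shown in the post.

pipe2 = WhisperModel("large-v2", device="cuda", compute_type="float16")

def call_whisper(filenames, pipe):
    global outputs
    outputs1 = []
    for file in filenames:
        # Load as 16 kHz mono and peak-normalize before transcription.
        audio, sr = librosa.load(file, sr=16000)
        audio = librosa.util.normalize(audio)
        # transcribe() returns a lazy generator; decoding runs when it is consumed.
        segments, info = pipe.transcribe(audio, beam_size=beam)
        print("sentence transcribed")
        # Join the text of all segments (the original kept only the first one).
        outputs1.append({"text": "".join(s.text for s in segments)})
    outputs = outputs1

t0 = timeit.timeit(
    stmt="call_whisper(filenames, pipe)",
    setup="from __main__ import call_whisper",
    globals={"filenames": filenames, "pipe": pipe2},
    number=1,
)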

ERRORS:

DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.0 (-1.257850 < -1.000000)
DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.2 (-1.112095 < -1.000000)
DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.4 (-1.222618 < -1.000000)
DEBUG:faster_whisper:Log probability threshold is not met with temperature 0.6 (-1.646218 < -1.000000)

As well as this one:

DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.0 (22.200000 > 2.400000)
DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.2 (22.200000 > 2.400000)
DEBUG:faster_whisper:Compression ratio threshold is not met with temperature 0.4 (28.956522 > 2.400000)

blackpolarz commented 10 months ago

These are not errors but debug messages telling you that the model is uncertain about its output. You can suppress them by raising the logging level, for example to WARNING.
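
A minimal sketch of that suggestion, using only the standard `logging` module. The logger name `faster_whisper` is taken from the `DEBUG:faster_whisper:` prefix in the log lines above; the assumption is that the messages are visible because logging was configured at DEBUG level somewhere in the benchmark script.

```python
import logging

# "faster_whisper" is the logger name shown in the DEBUG lines of this thread.
logger = logging.getLogger("faster_whisper")

# Raise the threshold so DEBUG (and INFO) records from the library are dropped.
logger.setLevel(logging.WARNING)
```

Note that this only hides the messages. The decoding work that produces them still runs, so it should not be expected to change the benchmark timings.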

Purfview commented 10 months ago

That's why in benchmarks you should disable fallback.
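
For context: each "threshold is not met" line in the log corresponds to a segment being decoded again at the next temperature, so a difficult file can be decoded several times. One way to disable this fallback is to pass a single `temperature` value to `transcribe()` instead of the default list of increasing temperatures. The sketch below is illustrative rather than definitive: the helper name `transcribe_without_fallback`, the `NO_FALLBACK` dictionary, and `beam_size=5` are assumptions made for this example, and `model` is assumed to be a `faster_whisper.WhisperModel`.

```python
# Sketch: transcription settings for a benchmark without temperature fallback.
# Assumption: `model` is a faster_whisper.WhisperModel (or any object with the
# same transcribe() interface); beam_size=5 is illustrative.

NO_FALLBACK = {
    "beam_size": 5,
    # A single value instead of a list of temperatures: nothing to fall back to.
    "temperature": 0.0,
}


def transcribe_without_fallback(model, audio):
    """Decode each segment once and return the joined text."""
    segments, _info = model.transcribe(audio, **NO_FALLBACK)
    # `segments` is a lazy generator; joining consumes it and runs the decoding.
    return "".join(segment.text for segment in segments)
```

With a single temperature the debug messages can still appear when a threshold is missed, but no further decoding passes follow, which makes timings comparable across languages.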