fedirz / faster-whisper-server

https://hub.docker.com/r/fedirz/faster-whisper-server
MIT License

faster-whisper-server output seems wrong #160

Open burness opened 7 hours ago

burness commented 7 hours ago

After a certain segment, all subsequently recognized text is incorrect:

from openai import OpenAI

client = OpenAI(api_key="cant-be-empty", base_url="http://192.168.31.100:8000/v1/")

# Open the file in a context manager so the handle is closed after the request.
with open("../../examples/test_02.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="Systran/faster-whisper-large-v3", file=audio_file
    )
print(transcript.text)

When I transcribe the same file directly with faster_whisper, the output looks fine:

from faster_whisper import WhisperModel

model_size = "large-v3"

model = WhisperModel(model_size, device="cuda")
segments, info = model.transcribe("test_02.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

Can anybody help me?

burness commented 7 hours ago

It seems that the default temperature of 0 causes this problem. Changing it from 0 to 0.7 solved it for me.

A small suggestion:

Change the default temperature from 0 to 0.7.
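
Until the default changes, the temperature can probably be set per request from the client. The sketch below is not tested against this server: it assumes the server honors the `temperature` field of the OpenAI-compatible transcription request, and the helper names `transcription_kwargs` and `transcribe` are made up for illustration. The base URL, model name and file path are the ones from the first comment.

```python
from typing import Any, BinaryIO


def transcription_kwargs(audio: BinaryIO, temperature: float = 0.7) -> dict[str, Any]:
    """Build keyword arguments for client.audio.transcriptions.create().

    Hypothetical helper; 0.7 is the value that worked in this issue.
    """
    # The OpenAI transcription API documents temperature as a value in [0, 1].
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    return {
        "model": "Systran/faster-whisper-large-v3",
        "file": audio,
        "temperature": temperature,
    }


def transcribe(path: str, temperature: float = 0.7) -> str:
    """Send one file to the server with an explicit temperature (hypothetical helper)."""
    # Imported here so the helper above works without the openai package installed.
    from openai import OpenAI

    client = OpenAI(api_key="cant-be-empty", base_url="http://192.168.31.100:8000/v1/")
    with open(path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            **transcription_kwargs(audio_file, temperature)
        )
    return transcript.text
```

Calling `print(transcribe("../../examples/test_02.mp3"))` would then send the same file as above with temperature 0.7 instead of relying on the server default.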