Joll59 / d-ser-t

d-ser-t quantifies the speech recognition accuracy of the MSFT speech service and/or user-created MSFT custom speech service models.

Bug fix for empty transcription results #86

Closed zanawar closed 4 years ago

zanawar commented 4 years ago

Bug fix for audio files in which an invalid (or empty) transcription precedes valid segments of speech.

This happened when a file contained background noise or dialogue, followed by static or silence, followed by the audio/dialogue we actually want to transcribe. CRIS returned a transcription for the background audio with <5% confidence, which resulted in an empty/blank transcription.
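
To illustrate the kind of filtering this scenario calls for, here is a minimal sketch in Python. It is not the code from this PR: the `Segment` type, the `join_valid_segments` helper, and the 0.05 threshold (taken from the "<5% confidence" figure above) are all assumptions made for the example, and the real service response has a different shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for one recognized segment of an audio file."""
    text: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def join_valid_segments(segments: List[Segment], min_confidence: float = 0.05) -> str:
    """Join the text of all segments that are non-blank and confident enough.

    Leading segments that are empty or below the threshold are skipped
    instead of being treated as the result for the whole file.
    """
    kept = [
        s.text.strip()
        for s in segments
        if s.text.strip() and s.confidence >= min_confidence
    ]
    return " ".join(kept)


# Background noise first (blank, low confidence), then the speech we want.
segments = [Segment("", 0.03), Segment("hello world", 0.92)]
transcript = join_valid_segments(segments)
```

With this approach, a blank or low-confidence first segment no longer masks the valid speech that follows it; a file with no usable segments still yields an empty string.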