lowerquality / gentle

gentle forced aligner
https://lowerquality.com/gentle/
MIT License

Bug Report: Déjà Vu in Auto Transcripts #86

Open natelawrence opened 7 years ago

natelawrence commented 7 years ago

I don't know if it's a Kaldi setting, a bug, or something else, but since I recently started testing Gentle on media files with no transcript, I've noticed that for a single segment of audio there are often two sections of text that both appear to describe that same segment on the timeline.

For example, on alignment 6afd2a94 toward the very beginning, we can hear:

Do we need to cite, like, uh retrogaming.tv or whatever?

And Kaldi/Gentle hands back:

do we need to buy like a preacher gaming got t._v. or a richer gaming got t._v. or whatever

In this example, the phrases "preacher gaming got t._v." and "richer gaming got t._v." both appear to be alternative attempts to transcribe "retrogaming.tv".

I could cite (many) more examples, but perhaps the above is expected behavior?

It's just a bit jarring when watching Gentle's interactive transcript.

Also, I would expect Gentle to have trouble aligning two consecutive alternate transcriptions of a single passage of audio to the same span of time.
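To make that last concern concrete, here is a rough sketch of how one might scan an alignment result for consecutive words whose time spans overlap, which is what two competing transcriptions of one stretch of audio would look like. It assumes each word entry is a dict with `word`, `start`, and `end` keys (times in seconds), which is my understanding of Gentle's JSON output. The function name `find_overlapping_words` and the sample timings are hypothetical and hand-written for illustration; they are not taken from alignment 6afd2a94.

```python
def find_overlapping_words(words, tolerance=0.0):
    """Return (previous, current) pairs of consecutive timed words
    where the current word starts before the previous one ends."""
    # Skip entries without timing information (assumed to be unaligned words).
    timed = [w for w in words if "start" in w and "end" in w]
    overlaps = []
    for prev, cur in zip(timed, timed[1:]):
        if cur["start"] < prev["end"] - tolerance:
            overlaps.append((prev, cur))
    return overlaps


# Hypothetical data: two alternative readings placed on the same audio span.
sample = [
    {"word": "preacher", "start": 1.00, "end": 1.40},
    {"word": "gaming", "start": 1.40, "end": 1.80},
    {"word": "richer", "start": 1.05, "end": 1.45},
    {"word": "gaming", "start": 1.45, "end": 1.80},
]

for prev, cur in find_overlapping_words(sample):
    print(f'"{cur["word"]}" starts at {cur["start"]}s, '
          f'before "{prev["word"]}" ends at {prev["end"]}s')
```

If the duplicated phrases really are mapped onto the same audio, a check like this should flag the point where the second attempt begins. If nothing is flagged, the duplicates are presumably being laid out one after the other on the timeline instead.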

natelawrence commented 6 years ago

For reference, this may have been the same issue addressed in #107 and #108.

It is difficult to say with any certainty, because the only way I currently have to test Gentle is the Docker container, which is not currently generating automated transcripts (see #145).