kibaffo33 / aws_transcribe_to_docx

Produce Word Document, CSV or SQLite transcriptions using the automatic speech recognition from AWS Transcribe.
MIT License

alternative transcripts #15

Closed (rastage closed 4 years ago)

rastage commented 4 years ago

Hi, I found this an extremely useful tool once I figured out how to use Python. How does it handle the alternative transcripts that can optionally be added to a job? There may be times when the transcription with a lower confidence level turns out to be more accurate.

kibaffo33 commented 4 years ago

Hi. Glad you found the tool useful.

tscribe should find the word with the highest confidence.

https://github.com/kibaffo33/aws_transcribe_to_docx/blob/58358cf402a9eeaa9ccf550f69fc3cbe9366ca4b/tscribe/__init__.py#L153-L172
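For illustration, picking the highest-confidence word can be sketched roughly as below. This is a minimal sketch, not the code at the link above. It assumes each entry of the Transcribe JSON `results["items"]` list carries an `alternatives` list whose `confidence` values are strings; `best_alternative` and `sample_item` are hypothetical names made up for the example.

```python
def best_alternative(item):
    """Return the alternative with the highest confidence for one item."""
    # Confidence values arrive as strings in the JSON, so cast before comparing.
    # A missing confidence is treated as 0 (an assumption made for this sketch).
    return max(item["alternatives"], key=lambda alt: float(alt.get("confidence", 0)))


# Hypothetical item, shaped like one entry of results["items"].
sample_item = {
    "type": "pronunciation",
    "alternatives": [
        {"confidence": "0.62", "content": "their"},
        {"confidence": "0.91", "content": "there"},
    ],
}

chosen = best_alternative(sample_item)
print(chosen["content"], chosen["confidence"])  # there 0.91
```

The lower-confidence candidate is simply dropped here, which is why it never reaches the output document.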

There may well be times when a lower confidence word is more accurate, but I don't think we would want to prompt the user to check them all. The docx output provides some highlighting, or rather it makes higher confidence words bold, which might help you correct lower accuracy words. I suppose an alternative would be to compare against another tool or another transcription service...
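If you only want to review the doubtful words rather than all of them, something along these lines might help. It is a rough sketch that lists words whose best confidence falls below a cut-off. The 0.8 threshold and the names `REVIEW_THRESHOLD`, `words_to_review` and `sample_items` are assumptions for the example, not tscribe settings or functions.

```python
REVIEW_THRESHOLD = 0.8  # hypothetical cut-off, not a tscribe setting


def words_to_review(items, threshold=REVIEW_THRESHOLD):
    """Return (position, word, confidence) for spoken words below the threshold."""
    flagged = []
    for position, item in enumerate(items):
        if item.get("type") != "pronunciation":
            continue  # punctuation items are not spoken words, so skip them
        best = max(item["alternatives"], key=lambda alt: float(alt["confidence"]))
        confidence = float(best["confidence"])
        if confidence < threshold:
            flagged.append((position, best["content"], confidence))
    return flagged


# Hypothetical items, shaped like entries of results["items"].
sample_items = [
    {"type": "pronunciation", "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
    {"type": "punctuation", "alternatives": [{"confidence": "0.0", "content": ","}]},
    {"type": "pronunciation", "alternatives": [{"confidence": "0.55", "content": "whirled"}]},
]

print(words_to_review(sample_items))  # [(2, 'whirled', 0.55)]
```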

Does this answer your question? Thank you

rastage commented 4 years ago

I guess it does, thanks for responding.