scaryrawr / vtt-generator

Generates VTT files for videos using ffmpeg and azure cognitive services
MIT License

Some part of transcript skipped in the VTT file #1

Open Invincible166 opened 4 years ago

Invincible166 commented 4 years ago

There is a small issue in the code on this line:

display_words = result['DisplayText'].split(' ')

The transcript can contain words like "person's". Azure returns such a word as two entries in the "words" list, person and 's, while splitting DisplayText on spaces keeps it as a single word. The indexes of display_words then fall out of sync with the words list. Use the lexical form instead:

display_words = transcript_obj['NBest'][max_confidence_index]['Lexical'].split(' ')

This would solve the problem. The lexical form lacks formatting such as capitalization and punctuation, but that can be restored with additional code.
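
A minimal sketch of the mismatch, assuming the behavior described in this comment. The payload is hand-written for illustration and is not a captured Azure response; the sample sentence, the confidence value, and the pick_best helper are hypothetical, and the way 's appears as its own token in Lexical follows the description above rather than verified output.

```python
# Hypothetical, hand-written payload shaped like the detailed recognition
# result discussed in this issue. Real responses carry more fields
# (offsets, durations, other text forms); they are omitted here.
transcript_obj = {
    "DisplayText": "The person's dog barked.",
    "NBest": [
        {
            "Confidence": 0.93,
            # Assumption from the issue: 's is a separate lexical token.
            "Lexical": "the person 's dog barked",
            "Words": [
                {"Word": "the"},
                {"Word": "person"},
                {"Word": "'s"},
                {"Word": "dog"},
                {"Word": "barked"},
            ],
        }
    ],
}


def pick_best(nbest):
    """Return the index of the NBest entry with the highest confidence."""
    return max(range(len(nbest)), key=lambda i: nbest[i]["Confidence"])


max_confidence_index = pick_best(transcript_obj["NBest"])
best = transcript_obj["NBest"][max_confidence_index]
words = best["Words"]

# Original approach: split the formatted display text on spaces.
display_words = transcript_obj["DisplayText"].split(" ")

# Suggested approach: split the lexical form of the same NBest entry.
lexical_words = best["Lexical"].split(" ")

# Compare token counts against the per-word timing list.
display_in_sync = len(display_words) == len(words)
lexical_in_sync = len(lexical_words) == len(words)

print(len(display_words), len(lexical_words), len(words))
```

With this sample, splitting DisplayText gives one token fewer than the words list, so every index after "person's" points at the wrong timing entry, while the lexical split lines up one-to-one.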

scaryrawr commented 4 years ago

Awesome catch! Thanks!