Some part of transcript skipped in the VTT file

There is a small issue in the code, on this line: display_words = result['DisplayText'].split(' ')

The transcript can contain words like "person's". Azure returns such a word in two parts in the "words" list, as person and 's, while splitting DisplayText on spaces yields it as a single token. The indexes of display_words then fall out of sync with the words list, which is why part of the transcript gets skipped. Use this instead:

display_words = transcript_obj['NBest'][max_confidence_index]['Lexical'].split(' ')

This solves the problem. The Lexical text does not carry the display formatting (capitalization and punctuation), but that can be restored with additional code.
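
A minimal sketch of the mismatch and the suggested fix. The `sample` response below is hand-written for illustration and only mirrors the behaviour described above (it is not captured Azure output), `lexical_words` is a hypothetical helper, and picking `max_confidence_index` by the `Confidence` field is an assumption about how the generator chooses its NBest entry.

```python
def lexical_words(transcript_obj):
    """Split the Lexical text of the highest-confidence NBest entry.

    Returns the tokens together with that entry's 'Words' list so the
    caller can check that both have the same length.
    """
    nbest = transcript_obj['NBest']
    # Assumption: the best entry is the one with the highest 'Confidence'.
    max_confidence_index = max(range(len(nbest)),
                               key=lambda i: nbest[i]['Confidence'])
    best = nbest[max_confidence_index]
    return best['Lexical'].split(' '), best['Words']


# Hypothetical response fragment, written by hand to reproduce the issue.
sample = {
    'DisplayText': "That person's hat.",
    'NBest': [{
        'Confidence': 0.9,
        'Lexical': "that person 's hat",
        'Words': [
            {'Word': 'that', 'Offset': 0, 'Duration': 3000000},
            {'Word': 'person', 'Offset': 3000000, 'Duration': 4000000},
            {'Word': "'s", 'Offset': 7000000, 'Duration': 1000000},
            {'Word': 'hat', 'Offset': 8000000, 'Duration': 3000000},
        ],
    }],
}

# Old approach: "person's" stays one token, so the list is one item short.
display_words = sample['DisplayText'].split(' ')

# Suggested approach: tokens line up one-to-one with the 'Words' entries.
tokens, words = lexical_words(sample)

# Pair each token with its timing information for cue generation.
timed = [(token, word['Offset'], word['Duration'])
         for token, word in zip(tokens, words)]
```

With this sample, display_words has three items while the Words list has four, so any index-based pairing drifts after "person's"; the Lexical split has four tokens and pairs cleanly.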

scaryrawr / vtt-generator

Some part of transcript skipped in the VTT file #1