Improving transcription segmentation to identify end of sentences

Currently, the transcript is split into sequential text segments by processing the .srt file. As a result, a segment is likely to end mid-sentence, at a word boundary rather than at the end of the speaker's sentence, and a question inserted at that point would feel abrupt.

This can be improved by processing the text to identify where the end of each sentence (EoS) actually lies. This may not be straightforward, since it depends on the transcription quality and on the speaker's presentation style.

One possible way to address this is to implement some kind of Part-of-Speech (POS) tagging mechanism to find the EoS in the transcript text. This could use simpler Natural Language Processing techniques or a BERT-based model.
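
As a starting point for discussion, here is a minimal sketch of the general idea: regroup .srt cues so that each segment ends where a sentence appears to end. It is not the POS-tagging or BERT-based approach described above. It swaps in a plain punctuation heuristic, which only works if the transcript contains punctuation. The names `Cue`, `parse_srt`, `ends_sentence`, and `merge_to_sentence_ends` are hypothetical and do not come from the existing codebase.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: str  # start timestamp as written in the .srt file
    end: str    # end timestamp as written in the .srt file
    text: str


def parse_srt(srt_text):
    """Parse .srt content into a list of Cue objects (hypothetical helper)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # The timing line is the one containing "-->".
        timing_idx = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing_idx is None:
            continue
        start, end = (part.strip() for part in lines[timing_idx].split("-->", 1))
        text = " ".join(lines[timing_idx + 1:])
        if text:
            cues.append(Cue(start, end, text))
    return cues


def ends_sentence(text):
    """Heuristic (assumption): a sentence ends on '.', '?' or '!',
    optionally followed by closing quotes or brackets."""
    stripped = text.rstrip().rstrip("\"')]")
    return stripped.endswith((".", "?", "!"))


def merge_to_sentence_ends(cues):
    """Merge consecutive cues until the accumulated text ends a sentence."""
    merged = []
    pending = None
    for cue in cues:
        if pending is None:
            pending = Cue(cue.start, cue.end, cue.text)
        else:
            pending = Cue(pending.start, cue.end, pending.text + " " + cue.text)
        if ends_sentence(pending.text):
            merged.append(pending)
            pending = None
    if pending is not None:
        # Trailing text with no sentence-ending punctuation is kept as-is.
        merged.append(pending)
    return merged


SAMPLE = """1
00:00:00,000 --> 00:00:02,000
Today we will look at how

2
00:00:02,000 --> 00:00:04,500
neural networks learn.

3
00:00:04,500 --> 00:00:07,000
What is a loss function?
"""

segments = merge_to_sentence_ends(parse_srt(SAMPLE))
for seg in segments:
    print(seg.start, "-->", seg.end, "|", seg.text)
```

This sketch only merges whole cues, so a sentence that ends in the middle of a cue is not split there, and abbreviations such as "e.g." at the end of a cue would be misread as sentence ends. A proper sentence-segmentation component from an NLP library, or a model-based approach as suggested above, would be needed to handle unpunctuated or noisy transcripts.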

tl-its-umich-edu / annoto-gai
