classtranscribe / WebAPI

Repository for the .NET Core backend for ClassTranscribe

More sophisticated caption generation #76

Open angrave opened 4 years ago

angrave commented 4 years ago

Add more "smarts" to caption generation

For example, a new sentence should usually start a new caption line.

Beware of end-of-caption edge cases (there are many...)

See https://github.com/classtranscribe/WebAPI/compare/MSToVtt, which was based on converting Angrave's word-to-captions Python code. See the heuristics here:

https://github.com/classtranscribe/PythonTools/blob/master/transcribe-cli/ms_json_to_caption.py
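
As a rough illustration of the "new sentence starts a new caption line" idea, here is a minimal Python sketch. It is not the code in ms_json_to_caption.py or the MSToVtt branch. The `Word` and `Caption` types, the `words_to_captions` function, and the `max_chars` limit are all hypothetical names chosen for this example, and it assumes each word arrives with start and end times in seconds and with punctuation already attached.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical timed word; times are in seconds."""
    text: str
    start: float
    end: float


@dataclass
class Caption:
    """Hypothetical caption line built from consecutive words."""
    text: str
    start: float
    end: float


# Assumption: sentence ends are detected purely from trailing punctuation.
SENTENCE_END = (".", "?", "!")


def words_to_captions(words: List[Word], max_chars: int = 40) -> List[Caption]:
    """Group words into captions, breaking at sentence ends and at max_chars."""
    captions: List[Caption] = []
    current: List[Word] = []

    def flush() -> None:
        # Emit the pending words as one caption, if there are any.
        if current:
            text = " ".join(w.text for w in current)
            captions.append(Caption(text, current[0].start, current[-1].end))
            current.clear()

    for word in words:
        # Break before this word if adding it would overflow the line.
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > max_chars:
            flush()
        current.append(word)
        # A word that ends a sentence also ends the caption line.
        if word.text.endswith(SENTENCE_END):
            flush()

    # End-of-input edge case: emit any trailing words without punctuation.
    flush()
    return captions
```

This sketch deliberately ignores several of the edge cases mentioned above. For instance, an abbreviation such as "e.g." would be treated as a sentence end, and a very short trailing caption is not merged back into the previous one.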

angrave commented 4 years ago

Partially addressed for main-language transcription in MSTWord.cs. See commit 3f492973c08535d22923737c6900f6f01db29e58#diff-2a9500e6cd909d0a9f9f24e512c678b0