Parse subtitle files during ingest and extract the text lines and their corresponding times (start times should be enough for search, but we might as well store both start and end).
Store the resulting text/time snippets in a model (Media -- 1:n -- SubtitleFile -- 1:n -- SubtitleLine).
Include the data in the fulltext search (the timecode can be used to navigate to the appropriate position in the media file upon retrieval).
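
The parsing step above could look roughly like the following. This is only a minimal sketch in Python, not a proposed implementation: the names SubtitleLine and parse_srt are made up for illustration, SubtitleLine is a plain dataclass rather than a real database model, and the parser only assumes blank-line separated cues with hh:mm:ss,mmm timings (a dot is also accepted as the millisecond separator). VTT timestamps without an hour part, cue settings and inline markup are not handled.

```python
import re
from dataclasses import dataclass

# Matches a cue timing row such as "00:01:02,500 --> 00:01:04,000".
# A dot is accepted in place of the comma, as used by WebVTT.
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class SubtitleLine:
    """Hypothetical stand-in for the SubtitleLine model mentioned above."""
    start: float  # seconds from the beginning of the media file
    end: float
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the four captured timestamp parts to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[SubtitleLine]:
    """Return one SubtitleLine per cue; blocks without a timing row are skipped."""
    result = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        rows = block.split("\n")
        for i, row in enumerate(rows):
            match = TIMING.search(row)
            if match:
                parts = match.groups()
                # Everything after the timing row is cue text; join multi-row cues.
                text = " ".join(r.strip() for r in rows[i + 1:] if r.strip())
                result.append(
                    SubtitleLine(_seconds(*parts[:4]), _seconds(*parts[4:]), text)
                )
                break
    return result


if __name__ == "__main__":
    sample = "1\n00:00:01,000 --> 00:00:02,500\nHello\nworld\n"
    for line in parse_srt(sample):
        print(line.start, line.end, line.text)
```

Each returned entry carries the text for the fulltext index plus the start time needed to jump to the matching position in the media file.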
We will need to index subtitle files at some point, but we probably should not add those to the database as all the data needed is already available in an accessible text format.
Basic parsing of SRT/VTT files is trivial; there are also a number of libraries available that provide more functionality: