As discussed at TPAC 2024-09-26 (please add minutes link when minutes have been prepared), add text track support to MSE.
Main use case is to simplify player code by allowing the same buffer pipeline management to be used for text tracks as for audio and video.
Suggestion to allow page code to pull the text content out of the buffer prior to presentation, so that different formats can be handled.
The discussion raised several related concerns that would need to be addressed by specification changes, including:
Handling in-band subtitles and captions
Understanding/expressing the active time interval for a non-wrapped plain text cue sample, e.g. plain text VTT. May need an explicit data structure or other API when pushing those samples into the MSE buffer
Defining the model for handling changes in the available text tracks, e.g. when inserted ads have a different number of text tracks from the main programme
Defining the player behaviour if the text track buffer runs out of content, but the audio and video buffers are adequately populated
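To make the interval and underrun concerns above more concrete, here is a minimal TypeScript sketch. Every name in it (`TextCueSample`, `TimeRange`, `bufferedTextRanges`, `textUnderrun`) is hypothetical and not part of MSE or of any agreed proposal. It only illustrates one possible shape for an explicit interval on a plain-text sample, and one way a player might detect that the text buffer has run out while audio and video are still populated.

```typescript
// Hypothetical shape for a plain-text sample pushed into a text buffer.
// None of these names exist in MSE today; they are assumptions for discussion.
interface TextCueSample {
  start: number;   // start of the sample's active interval, in seconds
  end: number;     // end of the active interval (exclusive), in seconds
  payload: string; // unparsed document text, e.g. WebVTT or TTML
}

interface TimeRange {
  start: number;
  end: number;
}

// Merge the active intervals of appended samples into buffered ranges,
// similar in spirit to what SourceBuffer.buffered reports for audio/video.
// The interval belongs to the sample, not to individual cues, so a passage
// with no dialogue still counts as buffered if a sample covers it.
function bufferedTextRanges(samples: TextCueSample[]): TimeRange[] {
  const sorted = [...samples].sort((a, b) => a.start - b.start);
  const ranges: TimeRange[] = [];
  for (const s of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && s.start <= last.end) {
      // Overlapping or adjacent sample: extend the current range.
      last.end = Math.max(last.end, s.end);
    } else {
      ranges.push({ start: s.start, end: s.end });
    }
  }
  return ranges;
}

// True when time t is covered by the A/V ranges but not by the text ranges.
// What the player should then do (stall, or continue without text) is
// exactly the open question; this only detects the condition.
function textUnderrun(t: number, av: TimeRange[], text: TimeRange[]): boolean {
  const covers = (rs: TimeRange[]) => rs.some(r => t >= r.start && t < r.end);
  return covers(av) && !covers(text);
}
```

With an explicit interval on each sample, a gap in the text ranges can be told apart from a stretch that simply contains no cues, which is what the underrun behaviour would need to depend on.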
The action was to raise two issues. It is not clear to me that in-band vs plain text is the right split; I think the following possibilities need to be considered:
Subtitles and captions embedded in the A/V multiplex, possibly directly included in the encoded video, e.g. 608/708 (maybe WSTeletext too?)
Subtitles and captions provided in ISOBMFF or wrapped in MPEG2 TS with identifiable formats and sample intervals, e.g. TTML (IMSC or EBU-TT-D) and WebVTT
Subtitles and captions provided as plain text resources delivered via a route separate from the audio and video, e.g. TTML (IMSC or EBU-TT-D) and WebVTT.
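The three cases above could be modelled as a tagged union, sketched below in TypeScript. The type and function names (`TextTrackDelivery`, `needsExplicitInterval`) and the `kind` labels are hypothetical, chosen only for illustration. The sketch assumes that only the separately delivered plain text case lacks a container-supplied sample interval.

```typescript
// Hypothetical tagged union for the three delivery cases listed above.
// "teletext" is included tentatively, mirroring the open question in the text.
type TextTrackDelivery =
  | { kind: "embedded"; format: "608" | "708" | "teletext" }
  | { kind: "wrapped"; container: "isobmff" | "mpeg2ts"; format: "ttml" | "webvtt" }
  | { kind: "separate"; format: "ttml" | "webvtt" };

// Assumption: an explicit interval must be supplied by the page only when
// no container or carrying video sample provides one.
function needsExplicitInterval(d: TextTrackDelivery): boolean {
  switch (d.kind) {
    case "embedded":
      return false; // timing follows the A/V samples that carry the data
    case "wrapped":
      return false; // the container supplies the sample interval
    case "separate":
      return true;  // no container to carry the sample interval
  }
}
```

Under this split, the number of issues to raise would follow from how many of these cases need distinct specification text, rather than from an in-band vs plain text distinction alone.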