Closed palemieux closed 2 years ago
Text display is currently handled by the browser or the application. Browsers do not currently support regions, but we do parse them and make the information available to text display plugins.
Adding this to the backlog. We will consider a complete DOM-based implementation of TextDisplayer.
In the meantime, feel free to write your own plugin for this. We would be happy to review a pull request if you decide to work on this.
@joeyparrish Thanks for the feedback. Is the following possible for a plug-in to:
div overlaying the video object)?
At each callback, the TTML plugin would draw the subtitle/captions in the target rendering DOM element.
It's not a TTML plugin, but rather a text display plugin. It receives cues, which it is responsible for rendering. Here's the interface documentation:
https://shaka-player-demo.appspot.com/docs/api/shakaExtern.TextDisplayer.html
Here's the documentation for the cue objects:
https://shaka-player-demo.appspot.com/docs/api/shaka.text.Cue.html
This system is more general than TTML. It will be used by the player for all subtitle/caption rendering, regardless of the input format.
You can pass whatever state you want into the constructor or into additional methods you have on your class.
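For illustration only, here is a minimal sketch of what such a text display plugin could look like. The method names (append, remove, destroy, isTextVisible, setTextVisibility) follow my reading of the TextDisplayer interface linked above and should be checked against those docs. The class name OverlayTextDisplayer, the container argument, and the render helper are hypothetical, and how the application drives render (for example from a timeupdate handler) is left open.

```javascript
// Sketch of an application-level text displayer. Not Shaka's implementation.
// `container` is assumed to be any object with a textContent property,
// for example a div positioned over the video element.
class OverlayTextDisplayer {
  constructor(container) {
    this.container = container;
    this.cues = [];
    this.visible = false;
  }

  // Receives cue objects with at least startTime, endTime and payload.
  append(cues) {
    this.cues.push(...cues);
  }

  // Drops every cue that overlaps the range [startTime, endTime).
  remove(startTime, endTime) {
    this.cues = this.cues.filter(
        (cue) => cue.endTime <= startTime || cue.startTime >= endTime);
    return true;
  }

  isTextVisible() {
    return this.visible;
  }

  setTextVisibility(on) {
    this.visible = on;
  }

  destroy() {
    this.cues = [];
    this.container.textContent = '';
    return Promise.resolve();
  }

  // Hypothetical helper, not part of the interface: draws the cues that are
  // active at currentTime into the container.
  render(currentTime) {
    const active = this.cues.filter(
        (cue) => cue.startTime <= currentTime && currentTime < cue.endTime);
    this.container.textContent =
        this.visible ? active.map((cue) => cue.payload).join('\n') : '';
  }
}
```

Because the container is passed to the constructor, the same class works with a real DOM element in the browser or a plain object in a test.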
Ok. Thanks for the details. As I understand it,
- Text Cue instances are generated
- Text Cue instances are passed to a Text Displayer plugin for rendering

Is that right? If so, what happens if the current Text Cue interface does not support the full range of capabilities offered by TTML? Can the media track parser return any object as long as it implements the Text Cue interface? Will the full object be passed back to the plugin?
I am thinking the media track parser could return a TTML cue that implements the current interface (for minimal compatibility) but also contains the information (as private members) needed to fully render the TTML cue. Would that work?
Thanks for your help. Happy to continue the discussion offline. Feel free to DM me at pal@sandflow.com.
The shaka.text.Cue class is owned by us, and is independent of the browser. If it is missing some TTML feature you need, we can extend it. We intend it to be generic enough to support both TTML and WebVTT.
The exact Cue object (as output by the TTML parser) will be sent to the TextDisplayer plugin. If there is something missing in that object, we would be happy to take either a feature request or pull request to add fields to Cue and add parser support for them in the TTML parser.
TTML supports capabilities beyond what shaka.text.Cue supports, including:
Couple of options:
- shaka.text.Cue to support all these variations
- shaka.text.Cue to carry an HTML fragment that would be generated by the TTML parser
- shaka.text.Cue to carry a TTML-specific object that would be generated by the TTML parser, and be available to the TTML rendering plugin

imscJS can readily support the second and third options.
Thoughts?
Makes sense to allow the Cue object to support richer definition of payload content than it does now. Maybe even by allowing Cue.payload to be a Cue itself, or if you don't want to support infinite nesting, then by including some kind of markup within the payload that would override the appropriate properties like fontStyle, fontWeight, color etc. for a subsection of text within them.
Some approach like that will be required to support TTML properly since it allows style attributes to be specified on spans within what are considered "cues" here.
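To make the nesting idea concrete, here is a rough sketch using plain objects. It assumes a hypothetical nestedCues array on each cue (not a field of the Cue class as discussed in this thread) and shows how a renderer could flatten nested cues into styled text runs, with each span inheriting fontStyle, fontWeight, and color from its parent unless it overrides them.

```javascript
// Style properties that a nested cue may override. The list is illustrative.
const STYLE_KEYS = ['color', 'fontStyle', 'fontWeight'];

// Flattens a cue and its (hypothetical) nestedCues into a list of text runs.
// Each run carries its text plus the style resolved through inheritance.
function flattenCue(cue, inherited = {}) {
  const style = {};
  for (const key of STYLE_KEYS) {
    style[key] = cue[key] !== undefined ? cue[key] : inherited[key];
  }
  const children = cue.nestedCues || [];
  if (children.length === 0) {
    return [Object.assign({text: cue.payload}, style)];
  }
  return children.reduce(
      (runs, child) => runs.concat(flattenCue(child, style)), []);
}

// Example: one cue whose second span is bold.
const exampleCue = {
  payload: '',
  color: 'white',
  fontWeight: 'normal',
  nestedCues: [
    {payload: 'Hello '},
    {payload: 'world', fontWeight: 'bold'},
  ],
};
const exampleRuns = flattenCue(exampleCue);
```

Here exampleRuns holds two runs: both inherit the white color, and only the second one is bold.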
First, let me address using imscJS for TTML support:
Our current TTML parser, limited though it may be, compiles to about 7,939 bytes. imscJS + its sax dependency is 159,787 bytes. That's a 20x increase in size for the TTML parser itself. Shaka Player as a whole is currently 185,141 bytes (in my working directory, anyway), so adding imscJS + sax would be an 86% increase in the size of Shaka Player itself.
That is way too big for a built-in subtitle parser. If you want to use imscJS to handle TTML, you can always have a text-parsing plugin at the application level which replaces our default TTML parser. See the API docs on TextParser if you want to pursue that: https://shaka-player-demo.appspot.com/docs/api/shakaExtern.TextParser.html
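For anyone who wants to re-check the size comparison, the ratios follow directly from the three byte counts quoted above. The variable names are only illustrative.

```javascript
// Byte counts as quoted in this comment.
const currentParserBytes = 7939;    // existing TTML parser
const imscWithSaxBytes = 159787;    // imscJS plus its sax dependency
const shakaPlayerBytes = 185141;    // Shaka Player as a whole

// How many times larger imscJS + sax is than the current parser.
const parserRatio = imscWithSaxBytes / currentParserBytes;

// Size added to the player, as a percentage of its current size.
const playerIncreasePercent = (imscWithSaxBytes / shakaPlayerBytes) * 100;

console.log(parserRatio.toFixed(1));            // 20.1
console.log(playerIncreasePercent.toFixed(1));  // 86.3
```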
As for enriching our existing TTML parser, I'm completely open to that. But since it sounds like our Cue interface is not up to the task of full TTML support, it will need to be redesigned. If you want to contribute in this area, I would prefer to get a design proposal for Cue and iterate on that before you make a complete pull request. If you don't want to do this, my team will look into it once we have taken care of some higher-priority features. For now, this is still in the "backlog" milestone for us.
Thanks!
That is way too big for a built-in subtitle parser. If you want to use imscJS to handle TTML, you can always have a text-parsing plugin at the application level which replaces our default TTML parser.
Seems a reasonable approach: have the application provide text parser and cue renderer implementations, e.g. using imscJS. Perhaps a simple example is all that is required.
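As a starting point for such an example, here is a sketch of the parser half. The parseInit and parseMedia method names and the periodStart field follow my reading of the TextParser docs linked earlier and should be verified there. Everything else is an assumption: parseDocument is a hypothetical stand-in for whatever a TTML library such as imscJS would provide, the event shape (begin, end, text) is invented for the sketch, and ttmlData is a made-up extra field that a matching displayer could use for full rendering. Registration with the player is left out.

```javascript
// Sketch of an application-level text parser that delegates to an injected
// TTML library. Not Shaka's TTML parser.
class InjectedTtmlParser {
  // parseDocument(xmlString) is assumed to return an array of
  // {begin, end, text} events in seconds.
  constructor(parseDocument) {
    this.parseDocument = parseDocument;
  }

  // Nothing to do for an init segment in this sketch.
  parseInit(data) {}

  // Decodes the segment, parses it, and returns cue-like objects.
  parseMedia(data, time) {
    const xml = new TextDecoder('utf-8').decode(data);
    return this.parseDocument(xml).map((event) => ({
      // Assumption: event times are relative to the start of the period.
      startTime: event.begin + time.periodStart,
      endTime: event.end + time.periodStart,
      payload: event.text,
      // Hypothetical extra field carrying the library's own representation.
      ttmlData: event,
    }));
  }
}
```

If the player instantiates parsers without constructor arguments, the application would need a small wrapper class or factory that supplies parseDocument.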
@joeyparrish P.S.: I very much appreciate your taking the time to explore these various options.
This issue is quite old, and our TTML and rendering capabilities have grown a lot. If there are still gaps in our support of TTML parsing or rendering, please feel free to open new issues for them. Thanks!
The captions in the following MPD should be presented in a region that starts 30% from the left and ends 10% from the right, and the font size should be 1/30 of the height of the video.
https://palemieux.com/public/foms2017/CEP150_512kb.mpd
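As a worked example of the expected layout, the sketch below converts those proportions into pixels for a given video size. It assumes the region ends 10% of the video width short of the right edge, so it spans 60% of the width. The function name and the returned field names are hypothetical.

```javascript
// Computes the expected caption region and font size in pixels.
// Percentages are applied as integer arithmetic to avoid rounding noise.
function expectedCaptionLayout(videoWidth, videoHeight) {
  const left = (videoWidth * 30) / 100;           // region starts 30% from the left
  const rightInset = (videoWidth * 10) / 100;     // region ends 10% from the right
  return {
    left: left,
    width: videoWidth - rightInset - left,        // 60% of the video width
    fontSize: videoHeight / 30,                   // 1/30 of the video height
  };
}

const layout720p = expectedCaptionLayout(1280, 720);
console.log(layout720p);  // { left: 384, width: 768, fontSize: 24 }
```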
Happy to provide additional information.
P.S.: consider using the W3C IMSC1 test suite to validate IMSC1 rendering, or using the imscJS polyfill, which is also used by dash.js.