Closed rjksmith closed 3 years ago
@nigelmegitt @gkatsev I was asked to raise this PR to maintain VTTCue alignment with the proposal for TextTrackCue.endTime = +Infinity. The change was previously discussed in issue https://github.com/whatwg/html/issues/5297 and has now progressed to PR https://github.com/whatwg/html/pull/5953.
I now see that there's an IPR issue as I'm not a member of TTWG, though I am an IE in SDWIG and have been working with MTE TF for a while. There's no urgency at the moment, but I'd like to resolve this blocking issue and would welcome your guidance on how best to proceed. Thanks.
cc @chrisn
@rjksmith, index.html is automatically generated from index.bs, using Bikeshed. In other words, the PR should update index.bs, not index.html.
Regarding the IPR issue, an easy way to solve this would be to have you sign a non-participant contribution commitment.
Thanks @tidoust!
I think updating index.bs would also trigger the PR Preview feature.
@tidoust @gkatsev Thanks for your guidance.
I've reverted my changes (7e4b64c) to index.html and updated index.bs to generate equivalent modifications with Bikeshed (17626fd).
I suspect that the Bikeshed tool has been updated since index.html was last officially generated and I've split these changes into a separate commit (f0323cb) for clarity.
Please let me know the process to sign a non-participant contribution agreement to cover IPR so we can resolve the blocking issue. Thanks.
Error is from WebIDL change at https://github.com/heycam/webidl/pull/798, which is supposed to be fixed by https://github.com/w3c/webvtt/pull/492
@himorin let me know if you need any assistance with the non-participant contribution agreement process - it should be relatively well-identified starting from https://labs.w3.org/repo-manager/pr/id/w3c/webvtt/493
@himorin let me know if you need any assistance with the non-participant contribution agreement process
ah, yes. For now, this is on hold on the repo management side (reorg, WebIDL alignment...), but it should be fixed shortly.
We've updated the main branch; this PR would need to be updated/rebased against the latest changes there.
The Timed Text Working Group just discussed [WebVTT] Added unbounded TextTrackCue.endTime w3c/webvtt#493, and agreed to the following:
SUMMARY: 1. Make sure we have tests for the change; 2. Enquire whether a syntax change is needed, and if so, investigate if it can be added in a non-breaking way.
WebVMT already handles unbounded cues as they are a common use case for live streaming.
The cue end time is simply omitted as in the Tower of London example. This has the advantage that an end time can easily be added to make the cue bounded when it becomes known at some future time.
@rjksmith yeah, I brought up how WebVMT does it. For WebVTT there's the API and the format. This PR currently only changes the API for VTTCue, which is necessary for the programmatic metadata cues as noted in the examples. It probably makes sense for WebVTT to be able to represent it in its text format too, but then it must do so in a backwards compatible manner, which is what (2) in the summary above refers to. Though, I think a syntax change can come in separately from the API change.
For the tests, we'd want to make a PR against web-platform-tests to update the constructor to allow endTime to be Infinity https://github.com/web-platform-tests/wpt/blob/master/webvtt/api/VTTCue/constructor.html.
Oh, just realized that maybe TextTrackCue has tests too and sure enough it does. https://github.com/web-platform-tests/wpt/blob/master/html/semantics/embedded-content/media-elements/interfaces/TextTrackCue/endTime.html I'll add a comment on the html repo PR too.
Also, I just verified that adopting WebVMT's syntax for unbounded cues can happen in a backwards compatible way. Parsers that don't recognize a time header without an end time (i.e. all current webvtt parsers) ignore that entire cue block but still show all subsequent ones. Tested in Chrome, Firefox, Safari, and vtt.js.
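To make the reported behaviour concrete, here is a minimal Python sketch of a strict parser that drops a cue block whose timing line has no end time and carries on with the next block. It is only a model of the behaviour described above, not the code of any browser or of vtt.js; the function name `parse_cues_strict` and the simplified timestamp pattern are assumptions made for illustration.

```python
import re

# Simplified timestamp: optional hours, then mm:ss.ttt (not the full WebVTT grammar).
TIMESTAMP = r"(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
# A strict timing line needs both a start and an end timestamp.
TIMING_LINE = re.compile(rf"^({TIMESTAMP})\s+-->\s+({TIMESTAMP})(?:\s+.*)?$")


def parse_cues_strict(text):
    """Return (start, end, payload) for each well-formed cue block.

    Blocks whose timing line lacks an end time are dropped, but parsing
    continues with the next block.
    """
    cues = []
    # Blocks are separated by blank lines; the first block is the WEBVTT header.
    for block in re.split(r"\n\s*\n", text.strip())[1:]:
        lines = block.splitlines()
        # Find the timing line (it may be preceded by a cue identifier).
        timing_index = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing_index is None:
            continue  # e.g. a NOTE block
        match = TIMING_LINE.match(lines[timing_index].strip())
        if match is None:
            continue  # unrecognised timing line: skip this cue only
        payload = "\n".join(lines[timing_index + 1:])
        cues.append((match.group(1), match.group(2), payload))
    return cues


sample = """WEBVTT

00:30.000 -->
This cue is unbounded

00:40.000 --> 00:50.000
This is a later cue
"""

print(parse_cues_strict(sample))
```

With this sample only the second cue survives, which matches the "ignored but not fatal" outcome described in the comment.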
@gkatsev Thanks for confirming. That sounds encouraging. I'll take a look at the web-platform-tests too.
@himorin let me know if you need any assistance with the non-participant contribution agreement process
ah, yes. For now, this is on hold on the repo management side (reorg, WebIDL alignment...), but it should be fixed shortly.
Is there any news on this or can you post a link to the agreement? Thanks
I think you might have received an email from the W3C non-participant contribution system on Mar 11, 2021, at the email address registered to your W3C account.
It would be interesting to know whether it's useful to have a way to represent an unbounded end time in a way that old parsers handle intelligently. But it may not be. I think this was discussed a bit at the M&EIG meeting that I, unfortunately, had to miss.
Also, I just verified that adopting WebVMT's syntax for unbounded cues can happen in a backwards compatible way. Parsers that don't recognize a time header without an end time (i.e. all current webvtt parsers) ignore that entire cue block but still show all subsequent ones. Tested in Chrome, Firefox, Safari, and vtt.js.
I'm not convinced that skipping the entire cue counts as "backwards compatible". I'd suggest a more backwards compatible approach here is one where a cue end time is specified, that works for all parsers, and additional syntax is available, that is ignored by parsers that don't recognise it, and used by those that do, where that syntax has the semantic "set the end time to infinity".
Then authors can specify a fallback end time of their choosing.
I'd propose a new WebVTT Cue Setting to satisfy this, because unknown cue settings should be ignored by parsers that don't recognise them.
For example:
00:30.000 --> 99:59.999 end-time-override: infinity
This cue is supposed to last effectively forever.
Current parser generates a cue whose end time is 99:59.999.
Parser that understands end-time-override: infinity generates a cue whose end time is Infinity.
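A rough Python sketch of how such a fallback could behave in the two kinds of parser. Everything here is illustrative: `end-time-override` is the hypothetical setting proposed above and is not part of WebVTT, `parse_timing_line` and `to_seconds` are made-up helpers, and the timestamp pattern is simplified. As I understand the existing cue settings syntax, settings are whitespace-separated name:value tokens, so the sketch writes the setting with no space after the colon.

```python
import math
import re

# Simplified timestamp: optional hours, then mm:ss.ttt (not the full WebVTT grammar).
TIMESTAMP = re.compile(r"^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$")


def to_seconds(stamp):
    match = TIMESTAMP.match(stamp)
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_timing_line(line, known_settings):
    """Parse 'start --> end [name:value ...]' into (start, end, settings)."""
    start, arrow, end, *tokens = line.split()
    if arrow != "-->":
        raise ValueError("missing '-->'")
    start_s, end_s = to_seconds(start), to_seconds(end)
    settings = {}
    for token in tokens:
        name, sep, value = token.partition(":")
        # Unknown or malformed settings are ignored; that is what makes the fallback work.
        if sep and name and value and name in known_settings:
            settings[name] = value
    # Hypothetical setting from the proposal above; not part of WebVTT.
    if settings.get("end-time-override") == "infinity":
        end_s = math.inf
    return start_s, end_s, settings


line = "00:30.000 --> 01:00:00.000 end-time-override:infinity"
existing = {"vertical", "line", "position", "size", "align", "region"}

legacy = parse_timing_line(line, existing)
aware = parse_timing_line(line, existing | {"end-time-override"})
print(legacy)
print(aware)
```

The legacy call keeps the author's fallback end time, while the aware call replaces it with infinity, so the author controls what older players do.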
I think it counts as backwards compatible, but it's definitely not the best behavior. I was verifying whether the current syntax that WebVMT used could be used and not wreak havoc on existing parsers, like having it fail to parse the rest of the file or something.
Behavior-wise, a new option definitely fits better with graceful degradation, which is definitely better than all or nothing. (For a similar reason, WebVTT really needs to add the rp element for ruby parentheticals).
I think you might have received an email from the W3C non-participant contribution system on Mar 11, 2021, at the email address registered to your W3C account.
@himorin Many thanks. I've found it.
Using unbounded cues is not compulsory. If backward compatibility is an issue then a long bounded cue can be used instead.
00:30.000 --> 99:59.999
This cue is supposed to last effectively forever.
This is fully backward compatible and is the current solution.
WebVMT also allows unbounded cues to be superseded which is an integral part of interpolation for live streaming use cases as discussed at TPAC 2020.
Using unbounded cues is not compulsory.
Indeed, but this issue is to allow the semantic of an unbounded cue to be expressed, right?
I think that Nigel's point is that it's hard to tell for all cases what the best backwards-compatible behavior should be:
However, I am not sure what the effect of (2) would be. Experiment is needed. It might cause players to conclude that the presentation really is 99 years long, and display e.g. a scroll-bar that long.
Yes I agree. However, displaying an unbounded cue indefinitely may not be the correct behaviour.
An unbounded cue is defined as a cue with an unspecified future end time. The current WHATWG spec represents the duration of an unbounded stream with Infinity: media.duration = Infinity. The stream demonstrably has a finite duration, so Infinity is being used to represent an unspecified future end time and the same definition applies to an unbounded cue - as previously discussed.
This can be illustrated with a simple use case. For example, the score during a live sporting event is "0-0" at the start of the game. This may change to "1-0" or "0-1" at an unknown future time or may stay the same, and the duration of the match may be extended. There is no way of knowing the duration of the "0-0" cue until the score changes or the game ends, so this is an unbounded cue.
The point is that it is not possible to predict the end time of an unbounded cue (or stream) because it is unspecified by definition, which is reflected in the WebVMT cue syntax:
00:30.000 -->
This cue is unbounded
This is a new HTML feature and if there is a requirement for backward compatibility, bounded cues can be used instead to ensure the desired result.
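The score example can be modelled in a few lines of Python, using infinity for the unspecified end time. This is a sketch only: the `Cue` class and the timings are invented for illustration, and it assumes some mechanism exists for updating a cue after it has been delivered.

```python
import math
from dataclasses import dataclass


@dataclass
class Cue:
    start: float
    text: str
    end: float = math.inf  # unspecified future end time

    def is_active(self, now):
        # Active from start (inclusive) to end (exclusive).
        return self.start <= now < self.end


score = Cue(start=0.0, text="0-0")
print(score.is_active(5400.0))  # still active however long the match runs

# A goal at 1500 s: the end time is now known, so bound the old cue
# and start a new unbounded one.
score.end = 1500.0
new_score = Cue(start=1500.0, text="1-0")
print(score.is_active(1500.0), new_score.is_active(1500.0))
```

Until the end time is assigned, the "0-0" cue is active at any time after its start, which is the sense in which it is unbounded rather than infinitely long.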
Um, the example you give doesn't seem to work, unless there is some provision to update a cue that's already been issued, decoded and acted on. The score 0-0 will last forever, even after a goal is scored.
Using captions for state is, I think, problematic.
@dwsinger I agree - there has been talk (in the MEIG) of creating a way to update the end time of a cue, but no proposal so far for how to do it. One of the problems is how to identify which cue's end time needs to be updated, given that the uniqueness of any cue identifier is only constrained within a single WebVTT file.
If I understand correctly there has been some consideration of handling unbounded end times in the context of ISOBMFF? I wonder how the group looking at that imagined this working - can you tell us anything more?
Yes it requires an update mechanism which exists in the HTML DOM. I agree that WebVTT would require modification to support this too.
WebVMT is designed to handle this issue for moving object trajectories, regions and sensors. Perhaps there's a way to apply this to text cues too.
In the file format, we're merely addressing how to handle cues that are marked as indefinite; any update of end-time would be something we'd reflect if it were in the base spec. I don't think we want to invent provisions at the MP4 level, merely encapsulate.
we're merely addressing how to handle cues that are marked as indefinite
I don't quite follow: there is no current way to mark a cue as indefinite in a file, so what is it that needs to be addressed?
I mean, we're considering how to address cues that have an unbounded end time, which is this pull.
I'd be interested in @gkatsev 's views here, but it feels like we'll keep spinning until we have a clearer notion of what the use cases and requirements are, and if there are different groups hoping to get something out of the notion of an unbounded end time, they should probably sit down together in some way and share what their goals are.
My impression was that this was settled and likely to land soon, and since ISO moves really slowly, we should start on the encapsulation work. If that's not the case, let me know. (Though text on "how to handle a wombat invasion" when in fact, there is no provision for wombats at all, is harmless).
I'm more concerned that there could be conceptual differences in the models in our heads for how this might work, so there could indeed be harm - some people might be trying to handle a wombat invasion, and others might be trying to work out how to house wombats in a safe way, and nobody is asking the wombats.
So, I think there are two things here.
(I was actually trying to test whether you can update a VTTCue live and how it updates but running into weird bugs, I'll try again another day)
One solution is to update the end time of an unbounded cue by superseding it with a bounded cue which has matching start time and content.
For example:
NOTE Unbounded cue
00:30.000 -->
This cue lasts until...
NOTE Subsequent cues go here
00:40.000 --> 00:50.000
This is a later cue
NOTE When known, update the unbounded cue end time
00:30.000 --> 01:30.000
This cue lasts until...
NOTE Cues after update time go here
01:30.000 --> 01:35.000
...now.
This would not require an identifier and has the advantage of including a valid bounded cue in the stream at the earliest moment when all its attributes are known - for compatibility. @gkatsev has already confirmed that unbounded cues will be ignored by current WebVTT parsers. WebVTT cue time ordering would need to be relaxed for this exceptional case and could be used as an efficient mechanism for identifying updates.
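A minimal Python sketch of the matching rule described here, assuming cues arrive as (start, end, text) tuples with `None` standing in for a missing end time. The function name `apply_cues` and the data layout are assumptions for illustration, not part of any proposal text.

```python
import math


def apply_cues(cues):
    """Collapse a stream of (start, end, text) cues, where end may be None.

    A bounded cue whose start time and text match an earlier unbounded cue
    supersedes it by setting its end time instead of adding a new cue.
    """
    result = []     # list of [start, end, text]
    unbounded = {}  # (start, text) -> index into result
    for start, end, text in cues:
        key = (start, text)
        if end is None:
            unbounded[key] = len(result)
            result.append([start, math.inf, text])
        elif key in unbounded:
            result[unbounded.pop(key)][1] = end
        else:
            result.append([start, end, text])
    return [tuple(cue) for cue in result]


stream = [
    (30.0, None, "This cue lasts until..."),
    (40.0, 50.0, "This is a later cue"),
    (30.0, 90.0, "This cue lasts until..."),
    (90.0, 95.0, "...now."),
]

for cue in apply_cues(stream):
    print(cue)
```

One limitation of this sketch is that two unbounded cues with identical start time and text cannot be told apart, so only the most recent one would be updated.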
At FOMS in previous years we discussed how WebVTT should be used for live and segmented vtt files, particularly around cues that span multiple segments. One of the things that came out there is that you could match up cues with the same text and start time across segments and update the end time to that of the last cue that you got, kind of like @rjksmith shows above. This actually also fits well with the unbounded cue with the missing end time thing.
We already have the situation where a cue's end time is "unbounded" at first and is updated later, because some in-band text track formats deliver cues with only a start time. In these formats a cue ends when the next cue or empty edit is delivered.
We support this in WebKit by giving these cues an end time of positive-infinity when they are delivered, and updating the end time when it is known
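As a rough model of that delivery pattern (not WebKit's actual code), the following Python sketch gives each delivered cue an infinite end time and closes it when the next cue or an empty edit arrives. The `deliver` helper and the dict layout are hypothetical.

```python
import math


def deliver(track, start, text):
    """Append a cue with only a start time; close the previous open cue, if any.

    `track` is a list of dicts with 'start', 'end' and 'text' keys.
    An empty `text` models an empty edit: it ends the open cue without adding one.
    """
    if track and track[-1]["end"] == math.inf:
        track[-1]["end"] = start
    if text:
        track.append({"start": start, "end": math.inf, "text": text})


track = []
deliver(track, 0.0, "first")
deliver(track, 4.0, "second")
deliver(track, 9.0, "")
print(track)
```

This only works because at most one cue is open at a time in this model.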
We already have the situation where a cue's end time is "unbounded" at first and is updated later, because some in-band text track formats deliver cues with only a start time. In these formats a cue ends when the next cue or empty edit is delivered.
That works in a semantic model where only one "cue" can be active at once, but WebVTT's model does not enforce that constraint, so I don't understand how it helps here?
I was merely pointing out that cues are being delivered on the web today whose duration is unknown when they are delivered, in case this isn't widely known.
@eric-carlson Thanks for your constructive feedback. Highlighting another valid use case is helpful and this could also be supported by WebVTT using the syntax proposed above. WebVMT has a similar use case for live streaming.
Highlighting another valid use case is helpful and this could also be supported by WebVTT using the syntax proposed above.
@rjksmith There are a lot of unanswered questions implied by the syntax proposal. First on my list is: given that WebVTT does not prohibit multiple cues with the same begin time, begin time seems to be unsuitable for use as a key to match the cue. How would you resolve this?
So, I think there are two things here.
Update the VTTCue API to allow Infinity for endTime. This aligns it with the outstanding PR against TextTrackCue (whatwg/html#5953) and comes from issue whatwg/html#5297. Based on the original issue, I think this is mostly a request for programmatic access and is mostly agreed upon, afaik.
As part of this PR, in the TTWG discussion, the question came up whether we wanted/needed syntax in WebVTT to represent this, and I think this is where a lot of the issues are coming up. I think it would definitely be valid to decide that WebVTT can't represent unbounded cues. Though, given that WebVTT is supposed to be used for metadata too, I think it probably should, but that can and should be addressed separately from this PR.
@gkatsev I agree.
The Timed Text Working Group just discussed WebVTT - Added unbounded TextTrackCue.endTime w3c/webvtt#493, and agreed to the following:
SUMMARY: 1. No objections to this PR as is, though we are blocked on tests. 2. Move the unbounded cue syntax question into a separate issue.
@nigelmegitt I agree that using only start time to match the cue is unsuitable which is why I proposed matching by start time and content as stated previously.
As promised above, I've now raised issue #496 for discussion of use cases and syntax changes separately and suggest we continue this conversation there. Thank you.
I think you might have received an email from the W3C non-participant contribution system on Mar 11, 2021, at the email address registered to your W3C account.
@himorin Many thanks. I've found it.
Thanks for your help. I can confirm that I've now signed this.
Oh, just realized that maybe TextTrackCue has tests too and sure enough it does. https://github.com/web-platform-tests/wpt/blob/master/html/semantics/embedded-content/media-elements/interfaces/TextTrackCue/endTime.html I'll add a comment on the html repo PR too.
@gkatsev Having written suitable tests in web-platform-tests/wpt#28394, I noticed that https://github.com/web-platform-tests/wpt/blob/master/html/semantics/embedded-content/media-elements/interfaces/TextTrackCue relies on the VTTCue constructor so whatwg/html#5953 and #493 (this) are interdependent. I don't foresee a problem, but these changes need to be properly co-ordinated because of the Web Platform Tests. Hope this helps.
@himorin I've signed the non-participant agreement but still see a failing check on this issue.
Acceptable: no Contributor: @rjksmith needs to submit their non-participant licensing commitment via the link they received by email.
Please advise if there are further actions I need to take to resolve this. Thanks.
Rebased to main
Excellent. You're welcome and thanks for your guidance @foolip
@gkatsev This PR is now ready for review and merge - as are whatwg/html#5953 and web-platform-tests/wpt#28394. Please co-ordinate with @foolip to synchronise these changes and let me know if there are any problems. Thanks.
Added support for unbounded TextTrackCue - see https://github.com/whatwg/html/pull/5953 Whitespace removed by Atom