w3c / ttml2

Timed Text Markup Language 2 (TTML2)
https://w3c.github.io/ttml2/

Clarify if the first ISD must/may be constructed when empty #1232

Closed nigelmegitt closed 2 years ago

nigelmegitt commented 3 years ago

This issue arises because there's discussion in MPEG regarding 14496-30 about defining the correct semantics of processing sequences of documents each in a wrapper, where the draft text explains it in terms of sequences of ISDs. @cconcolato @mikedo

The specification for [resolve timing] step 2 is:

divide the active time duration of the current document instance into an ordered sequence of time coordinates {T0, T1, T2, ...} where, at each time coordinate Ti, some element becomes temporally active or inactive.

The question I have is: in the case that the first element that becomes temporally active does so at a time T1 where T1 > 0, does this imply or require that there is a first empty ISD that begins at time 0 and ends at T1?

Taking it literally, no element becomes temporally active or inactive at time 0, so arguably no ISD should be generated. However, the draft text I've had sight of implies an expectation that there is such an "empty document", and it is possible that implementers would like to be able to rely on this behaviour.
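To make the literal reading concrete, here is a minimal sketch of step 2. It assumes each timed element has already been reduced to a hypothetical `(begin, end)` pair in seconds; the function name and representation are illustrative and do not come from the specification.

```python
from typing import List, Tuple

# Hypothetical representation: (begin, end) in seconds, end exclusive.
Interval = Tuple[float, float]


def time_coordinates(active_intervals: List[Interval]) -> List[float]:
    """Return the sorted, de-duplicated times at which some element
    becomes temporally active or inactive."""
    times = set()
    for begin, end in active_intervals:
        times.add(begin)
        times.add(end)
    return sorted(times)


# Two elements: one active on [5, 10), another on [8, 12).
coords = time_coordinates([(5.0, 10.0), (8.0, 12.0)])
print(coords)  # [5.0, 8.0, 10.0, 12.0]
```

Under this reading the first coordinate is 5.0, so nothing in the sequence covers the interval from 0 to 5.0 unless time 0 is added explicitly.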

I'd like to discover what our expectation and understanding is here, and if there is implementation experience to guide us, so that we can state this more clearly in the specification, and feed this information to others who may depend on it.

mikedo commented 3 years ago

The expectation is that the normal behavior is that all samples (documents) will have a T0 ISD to recreate the display state of the last ISD in the previous sample. This is required for the requirement that every sample (or at least segment) is a RAP. The exception would be if the last ISD in the previous sample had no active text or region at the end of the sample.

There are no sparse tracks. Every sample on the track timeline must be present and contain a valid document. Such a document would be the empty document only if there was no active text or regions for the entire sample.

This is trickier and probably needs more thought for the handling of indefinite end times.

nigelmegitt commented 3 years ago

Thanks @mikedo that is interesting but I don't think it answers the question. We can get closer by narrowing down the scenario you describe in either of two ways:

  1. Consider the case where the last ISD in the previous sample is indeed "empty" and the first content to be displayed in the current sample appears some time later than the beginning of the sample OR
  2. The first content to be displayed in the first sample appears some time later than the beginning of the sample.

In both of these cases the TTML processor must produce a first "non-empty" ISD1 that begins at a time greater than zero. But should it also produce an "empty" ISD0 from time zero until the begin time of ISD1? Does it matter to implementers?

palemieux commented 3 years ago

in the case that the first element that becomes temporally active does so at a time T1 where T1 > 0, does this imply or require that there is a first empty ISD that begins at time 0 and ends at T1?

Yes. There is always an ISD at t = 0.

palemieux commented 3 years ago

Similarly, there is always a last ISD whose end time is indefinite.

nigelmegitt commented 3 years ago

Yes. There is always an ISD at t = 0.

That's strong certainty @palemieux ! How did you come to this conclusion?

nigelmegitt commented 3 years ago

By the way, I would be in favour of a clarification that says "always create an ISD at t = 0". I don't want to do so if there are implementations that do not do this now, and would need to be changed. If there are such implementations, then I would add a clarification that says "creating an ISD at t = 0 if there are no active elements then is optional".

I'm not very comfortable with the current state where possibly different readers come to different conclusions.

palemieux commented 3 years ago

From a practical perspective, it is simpler and safer for an ISD to always be defined at t = 0, even if it is empty.

There are otherwise many ambiguities and corner cases to consider, e.g.:

nigelmegitt commented 3 years ago

can_of_worms.open()

Those may be related questions, but I think this is a much simpler and more practical question, without worrying about them. The point is: for any given time t is there always exactly one ISD where t falls within that ISD's interval? It may be that the ISD is "empty" of course.

I think this is about avoiding an "if" statement, when it comes down to it, to test for the existence of an ISD, and handling the case where there is none. Maybe all implementers already do this, or maybe they all expect an ISD always.
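The "if" statement in question can be illustrated with a small lookup sketch. The ISD list here is a hypothetical `(begin, content)` representation chosen for illustration, where each ISD lasts until the next one begins; it is not taken from any implementation.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical representation: (begin, content) pairs sorted by begin.
# Each ISD lasts until the next one begins; the last one is open-ended.
IsdList = List[Tuple[float, str]]


def isd_at(isds: IsdList, t: float) -> Optional[str]:
    """Return the content of the ISD whose interval contains t,
    or None if t precedes the first ISD."""
    begins = [begin for begin, _ in isds]
    index = bisect_right(begins, t) - 1
    if index < 0:
        return None  # no ISD covers t: the caller must handle this case
    return isds[index][1]


without_empty = [(5.0, "Hello"), (10.0, "")]
with_empty = [(0.0, ""), (5.0, "Hello"), (10.0, "")]

print(repr(isd_at(without_empty, 2.0)))  # None
print(repr(isd_at(with_empty, 2.0)))     # ''
```

With an ISD always present at t = 0, the `None` branch is never reached for non-negative t, and callers can treat "nothing to show" and "empty ISD" as the same case.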

palemieux commented 3 years ago

for any given time t is there always exactly one ISD where t falls within that ISD's interval? It may be that the ISD is "empty" of course.

Exactly :)

skynavga commented 3 years ago

It is definitely NOT the case that TTML ISD generation must or may produce an "empty ISD instance" at T=0. I have no idea how PAL came to his conclusion. Certainly there is nothing in TTML to suggest this is the case. On the other hand, I see this question as an exercise in how the ISD sequence produced by the currently specified TTML algorithm can or may be used by downstream applications. As far as that goes, I see no reason that one could not produce a particular option in a TTML Presentation Processor implementation that does indeed produce such an empty ISD at the starting line. At the same time, I don't see any expectation on the part of TTML that all applications should do this or even expect it done.

palemieux commented 3 years ago

@skynavga To be clear, the specification does not impose any requirement for an implementation to produce any ISD. However, as stated above, it is simpler and safer for an ISD to always be defined at t = 0, even if it is empty, or, as @nigelmegitt writes: "The point is: for any given time t is there always exactly one ISD where t falls within that ISD's interval?"

skynavga commented 3 years ago

@palemieux The current algorithm defined by [construct intermediate document] assigns T0 to the time coordinate associated with the start of the first active interval in the document; if T0 > 0, then, as defined, that algorithm does not produce an intermediate document for the interval [0,T0).

Furthermore for intervals [Ti,Ti+1), the current algorithm elides (in steps 4,5) the body element of that interval, both the temporary body created in step 2 and the original source document's body, if the temporary (replicated) body is empty.

So, there is nothing in the current TTML specification that implies an intermediate document on the interval [0,T0). Notwithstanding this fact, the only application of this output defined within the specification is a TTML Presentation Processor, which, among other things, must implement some realization of Synchronic Flow Processing, but it makes no assumption about the existence of inactive intervals (in general). The TTML specification also defines one concrete form for this output, namely the Intermediate Document Syntax (ISD), but does not define any application of this syntax.

My conclusion is that, if there is some other (externally defined) application specification that makes use of output of [construct intermediate document], then that specification may define such usage in a manner that requires its application specific processors to insert an empty intermediate document for the interval [0,T0). This would fall under the definition of a higher level protocol.
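As a rough sketch of what such a higher level protocol might do, the function below pads a sequence of ISDs with empty ones so that the whole application timeline is covered. The `(begin, end, content)` tuples, the function name, and the `timeline_end` parameter are all assumptions made for illustration, and the input ISDs are assumed not to overlap.

```python
from typing import List, Tuple

# Hypothetical representation: (begin, end, content), end exclusive.
Isd = Tuple[float, float, str]


def pad_with_empty_isds(isds: List[Isd], timeline_end: float) -> List[Isd]:
    """Fill [0, timeline_end) so that every time in it falls in exactly
    one ISD. Assumes the input ISDs do not overlap."""
    padded: List[Isd] = []
    cursor = 0.0
    for begin, end, content in sorted(isds):
        if begin > cursor:
            padded.append((cursor, begin, ""))  # empty ISD covering the gap
        padded.append((begin, end, content))
        cursor = end
    if cursor < timeline_end:
        padded.append((cursor, timeline_end, ""))  # trailing empty ISD
    return padded


print(pad_with_empty_isds([(5.0, 10.0, "Hello")], timeline_end=15.0))
# [(0.0, 5.0, ''), (5.0, 10.0, 'Hello'), (10.0, 15.0, '')]
```

Keeping the padding in a separate step mirrors the position above: the output of [construct intermediate document] is left as specified, and the empty ISD for [0,T0) is added by the application that needs it.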

nigelmegitt commented 3 years ago

Thanks all for the discussion on this. I think we may be able to reach a conclusion.

Reading the comments and responses it seems to me that we have agreement that:

Have I got any of those wrong?

I had to read the spec very carefully to come to these conclusions, so I suspect it would be helpful to others to add an informative note to the [resolve timing] algorithm explaining these consequences. I'm happy to draft one, or for someone else to.

I am not sure if it would be helpful or not also to raise an issue against IMSC asking it to require a particular behaviour here. I don't think I've seen enough implementer response to come to a conclusion on that yet.

skynavga commented 3 years ago

  • TTML does require a final ISD that has no end time, be that empty or not.

How do you reach this conclusion? The specification, for example, ties the active duration of body to the duration of the root temporal extent, which may be definite or indefinite. If definite, then body has a definite end time, and the same applies for regions, head, etc. Since the root temporal extent is defined by the document processing context, it is application specific as to whether the last ISD has indefinite duration. In the case of TTPE, the user supplies an external time extent, which may be definite or not, and that is used to bound the root temporal extent. So I would not agree with your 3rd bullet above.

nigelmegitt commented 3 years ago

How do you reach this conclusion?

Thanks for making me look harder at this. I did miss the possibility that all of the elements in the document that can be timed have a definite end time. In that case, I can see that a final empty ISD might not be generated.

Looking at the [resolve timing] procedure, as I quoted at https://github.com/w3c/ttml2/issues/1232#issue-891763373, it relies on a thing called "active time duration" which isn't actually defined in the document; this is the only instance where that phrase appears. The active time duration is that of the document instance, not the body, so presumably allows for regions whose intervals extend beyond the body's interval.

I have not found anything in the spec that defines "active time duration of the current document instance" in relation to the root temporal extent, though reading the definition of the latter, it looks like they may well be referring to the same concept. This could be worth tidying up?

Amending my bullets from https://github.com/w3c/ttml2/issues/1232#issuecomment-842143323 :

skynavga commented 3 years ago

@nigelmegitt I now concur with your bullets above; and, yes, we can change "active time duration of the current document instance" to read "active time duration, i.e., root temporal extent, of the current document instance"; the current language was from a time prior to our introducing the term "root temporal extent".

css-meeting-bot commented 3 years ago

The Timed Text Working Group just discussed Clarify if the first ISD must/may be constructed when empty w3c/ttml2#1232, and agreed to the following:

The full IRC log of that discussion
<nigel> Topic: Clarify if the first ISD must/may be constructed when empty w3c/ttml2#1232
<nigel> github: https://github.com/w3c/ttml2/issues/1232
<nigel> Nigel: This has had a pull request open for >2 weeks, and we do not have consensus to merge it.
<nigel> .. In fact there are objections to merging the pull request, despite it saying what I thought had been agreed in the issue.
<nigel> .. It seems that the best action here is to close the PR and the issue, marking as "not doing".
<nigel> .. The motivation was to try to help other downstream groups. We agreed there is flexibility in TTML2.
<nigel> Pierre: It could be worth text that says "it is always possible to create an ISD everywhere, it just might be an empty ISD"
<nigel> .. It's a useful observation because then it removes some difficult to define concepts such as root temporal extent.
<nigel> .. You just run a procedure and then you always get something.
<nigel> .. That's one useful outcome from the thread.
<nigel> .. It's particularly useful when you put a TTML document on a timeline,
<nigel> .. where that timeline could start before the begin on a body element for instance and could
<nigel> .. end after the end time on the body element.
<nigel> Cyril: You mentioned MPEG. It would be good for MPEG to have the text you proposed but not strictly necessary.
<nigel> .. As long as we all agree, then we're good.
<nigel> .. I like the text you suggested.
<nigel> .. The only unclear part is what Pierre mentioned about the root temporal extent.
<nigel> .. The rest is uncontroversial.
<nigel> .. What's missing is the definition of an empty document.
<nigel> .. There's some convergence on the empty TTML document defined by EBU.
<nigel> .. But there's no definition of an empty ISD, is there?
<nigel> Nigel: No, I don't think so.
<nigel> .. I get the sense that no change is required but some change is helpful.
<nigel> .. Should we continue the discussion and working on the pull request?
<nigel> Pierre: My issue with the current pull request is that it suggests that there is a correct ISD sequence.
<nigel> .. I think everyone can agree that for T between 0 and infinity there is always an ISD.
<nigel> Cyril: The wording proposed by Nigel is good, that defers to application.
<nigel> .. I think the mention of root temporal extent was the problem for Glenn.
<nigel> Pierre: That's my objection too.
<nigel> Cyril: Maybe if we just remove that part.
<nigel> Pierre: You can just say that for all time there is an ISD, rather than depending on an unclear begin and end, which don't matter.
<nigel> Nigel: Trying to understand, so you want to say that a sequence of ISDs can be created from any TTML document such that
<nigel> .. there is always an ISD for every positive time T, but that not all applications need to make that whole sequence.
<nigel> Pierre: But the spec should not say "the ISD before some start time is undefined" : it's just empty.
<nigel> Cyril: It's worth giving this a shot, understanding the objections better, now that we understand the objections from Pierre better.
<nigel> Nigel: Okay, thank you, I'll continue to put effort into this.
<nigel> SUMMARY: Nigel @nigelmegitt to attempt to resolve objections to the current PR text.

css-meeting-bot commented 3 years ago

The Timed Text Working Group just discussed Clarify if the first ISD must/may be constructed when empty w3c/ttml2#1232, and agreed to the following:

The full IRC log of that discussion
<nigel> Topic: Clarify if the first ISD must/may be constructed when empty w3c/ttml2#1232
<nigel> github: https://github.com/w3c/ttml2/issues/1232
<nigel> Glenn: I added a comment to the PR
<nigel> -> https://github.com/w3c/ttml2/pull/1233#discussion_r650411506 Comment
<nigel> .. pointing out that there is already text in the TTML element that makes the equivalence between
<nigel> .. active document interval and root temporal extent. We already have established that,
<nigel> .. it is just that this particular instance in this procedure should have the consistent language.
<nigel> .. It is not introducing anything new or different in my opinion.
<nigel> .. I'd like to see that move forward.
<nigel> Nigel: Am I correct that you're not happy with that Pierre?
<nigel> Pierre: If we are going to make that change we should rationalise the terms across the document
<nigel> .. and really get to the bottom of what the term root temporal extent means.
<nigel> .. I don't think we should make this change piecemeal.
<nigel> Glenn: I think this started because the wording "active document duration" appears and it is the only place where it appears
<nigel> .. exactly like that. The intent here is simply to resolve that one issue.
<nigel> .. It is clear that's what is meant here.
<nigel> Pierre: I don't think it is clear.
<nigel> .. The term that has been used has been duration, now we're replacing it with extent.
<nigel> .. I would like to know what root temporal extent means.
<nigel> Glenn: That boat has sailed.
<nigel> Pierre: I don't know, it's been ambiguous and we should say what it does.
<nigel> .. It is not defined in the document, we're trying to clarify it.
<nigel> Glenn: Root temporal extent is defined as a term.
<nigel> Pierre: It is a circular definition. If we're clarifying it, we should say what it means or does.
<nigel> Glenn: The intent of this change is not to modify the definition of root temporal extent.
<nigel> Pierre: It actually changes the interpretation though.
<nigel> .. My suggestion is to go back and rationalise what root temporal extent means.
<nigel> .. We should not make piecemeal changes.
<nigel> Glenn: I find that quite interesting and wouldn't discourage anyone from undertaking such a project.
<nigel> .. This particular issue is not predicated on reviewing the definition of root temporal extent.
<nigel> .. If you think it is true I would like to see the argument.
<nigel> Nigel: This has been discussed before. It would be good to explain why this procedure depends on the term
<nigel> .. root temporal extent and defines it, which is circular.
<nigel> Pierre: The XXXXScribeMissedXXXX
<nigel> Glenn: The root temporal extent is defined by the document processing context.
<nigel> Pierre: It's never clear to me how there can be an implicit duration but no implicit begin and end.
<nigel> Glenn: This goes back to the semantics of SMIL which make use of the term implicit duration in a highly technical manner.
<nigel> .. We have used that definition in the context of TTML.
<nigel> .. SMIL does not (I don't recall) define an implicit begin or end and we did not do that.
<nigel> .. That sounds like a new work item/requirement that is not on our docket right now.
<nigel> .. I think it is inappropriate to slip it into this PR - it may be an interesting question and possibly elaborate that
<nigel> .. more in the definition of root temporal extent. But it is clear in the current language that we have
<nigel> .. an equivalence statement in the specification of the tt element, so what this change proposes is simply
<nigel> .. to make that usage consistent within the document because we had a case in the timing
<nigel> .. semantics that did not define that properly.
<nigel> Pierre: By the way SMIL does define implicit end and implicit begin.
<nigel> Glenn: Thank you
<nigel> Pierre: Do they apply here?
<nigel> Glenn: That's outside the scope of this PR in my opinion.
<nigel> Pierre: That's my point, if we are tweaking or capturing the original intent of root temporal extent then we have
<nigel> .. to get to the bottom of this.
<nigel> .. My interest here is that there has been confusion here about what the active duration
<nigel> .. of a TTML document is, if you try to render a document outside its active duration.
<nigel> Glenn: Durations have a fixed usage in TTML and SMIL that is independent of the begin and end points.
<nigel> .. If you can resolve the begin and end then the difference is the active duration.
<nigel> .. I still fail to see how you can interpret the current PR as an attempt to redefine the root temporal extent,
<nigel> .. especially as we already have the statement that makes that equivalence.
<nigel> .. If the phrases are different from the intended meaning in resolve timing, then I don't know what else it could be.
<nigel> .. "Active time duration" sounds like a shorthand for that tt element definition.
<nigel> .. So this change seems to make this more consistent rather than less so.
<nigel> Nigel: By the way that is my position as well.
<nigel> Glenn: If you think this is redefining root temporal extent I would like to see the argument for that.
<nigel> .. It is not the intent, and if it were true then we would have to revisit the language in the tt element as well, which is
<nigel> .. not in the scope of this issue.
<nigel> .. I have no objection to revisiting and trying to fine tune the use of the term root temporal extent.
<nigel> Nigel: Thank you, we're running out of time. Anyone else have anything to add on this?
<nigel> .. [no] - we need to work out a way to resolve.
<nigel> .. I brought this to the group to try to work out how to get to consensus on the PR.
<nigel> Pierre: Maybe we're closer than you think - remove the note, and take the "i.e." out, but ultimately the
<nigel> .. root temporal extent is application specified.
<nigel> Nigel: Thank you, please could you comment on the pull request so we can end the call?
<nigel> Pierre: Happy to stay on and discuss further if you have time.
<nigel> SUMMARY: Nigel and Pierre to continue discussions.