w3c / ttml3

Timed Text Markup Language 3 (TTML3)
http://w3c.github.io/ttml3/

Remove ttp:mediaOffset. #10

Closed nigelmegitt closed 5 years ago

nigelmegitt commented 7 years ago

See #125 for background. Raising this to capture my thoughts on mediaOffset that I have previously discussed during face to face meetings.

There are multiple problems with ttp:mediaOffset as currently specified:

Problems

Multiple presentation contexts

ttp:mediaOffset appears to require that exactly one related media object is relevant when deciding what media offset to apply at presentation time; in other words, it fails if there is more than one presentation context and different presentation contexts have different offset requirements. In general TTML documents are likely to be reused in different presentation contexts in an "author once, use many times" pattern.

Duplication of functionality present elsewhere

It duplicates functionality more usually put into wrapper formats - if it is only metadata from the TTML context then it should not be in the ttp namespace. It is unclear whether some additional time computation is required at presentation time or if the offset value should simply be transferred to the wrapper format. This ambiguity is likely to cause interoperability problems, especially if the offset is in some implementations effectively processed twice.

Circular definition

In Terminology:

[root temporal extent] The temporal extent (interval) defined by the temporal beginning and ending of a document instance in relationship with some external application or presentation context.

In §7.2.10 ttp:mediaOffset

The ttp:mediaOffset attribute is used to specify the temporal offset between the begin time of the root temporal extent and the begin time of a related media object ...

Expanding these, we seem to have a temporal offset between the beginning of a (temporal extent defined by the temporal beginning and ending of a document instance in relationship with some ... presentation context) and the begin time of a related media object. This doesn't appear to make any sense. If it is defining the relationship between the document instance's begin time and the presentation context, then the above wording in §7.2.10 doesn't say that.

I am taking it that a presentation context includes possible playback of some related media object.

SMPTE timebase application problem

If ttp:mediaOffset applies to smpte timebase then this will usually result in playback failure - it is effectively saying that, although the author used SMPTE timecode to align the subtitles with the video, that timecode alignment is in fact wrong. The correct solution to this problem is to fix the times in the document instance, not to apply a media offset.

Media timebase application problem

It is a possible interpretation that ttp:mediaOffset specifies the relationship between media time and the document's timeline, in which case it cannot apply if timebase is "media", since in that timebase the relationship is already defined.

Handling of negative times is undefined

ttp:mediaOffset, if interpreted as being something that modifies all the computed times, can result in negative times, whose semantic is undefined.

Does not identify beginning of root temporal extent

ttp:mediaOffset does not actually support the (possibly orthogonal) need to define the beginning of intended presentation, i.e. the entry point in the document instance's timeline, or the point before which any ISDs should be truncated. This is asymmetric with the existing ability to specify the end of presentation, which can be done by putting a dur or end attribute on, say, the body element, or if the intent is for indefinite end, the end time can be left unspecified on some leaf elements. Currently all documents appear to begin at time zero regardless of the intended playback begin point.

Proposed resolution

  1. Remove ttp:mediaOffset.
  2. Add a ttp:temporalBegin (name can be discussed) that does not modify the syncbase of the root element but truncates all ISDs beginning before it so that the content does not appear. This deals with the "pre-roll" scenario for example, and can also safely be carried over to wrapper formats for example to derive fragment or segment times without needing to modify the computed times in the document instance. In other words, if duplicated in a wrapper it remains "safe".
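
A minimal sketch of the truncation described in item 2, under stated assumptions: `clip_to_temporal_begin` and the `(begin, end)` tuple representation of an ISD are hypothetical names for illustration, and "truncates" is read here as "drop ISDs that end at or before the entry point, and shorten ISDs that straddle it". No time is shifted, so the document timeline itself is unchanged.

```python
from typing import List, Optional, Tuple

# (begin, end) in seconds on the document timeline; end=None means indefinite.
Interval = Tuple[float, Optional[float]]


def clip_to_temporal_begin(isds: List[Interval], temporal_begin: float) -> List[Interval]:
    """Drop or shorten ISD intervals that start before temporal_begin.

    Hypothetical helper: times stay on the document timeline, nothing is offset.
    """
    clipped = []
    for begin, end in isds:
        if end is not None and end <= temporal_begin:
            continue  # ISD lies wholly before the entry point
        clipped.append((max(begin, temporal_begin), end))
    return clipped


isds = [(0.0, 5.0), (8.0, 12.0), (12.0, None)]
print(clip_to_temporal_begin(isds, 10.0))
```

With an entry point of 10s, the first ISD disappears, the second is shortened to begin at 10s, and the third is untouched.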

skynavga commented 7 years ago

Regarding multiple presentation contexts, TTML does not recognize the notion of multiple media objects or multiple presentation contexts in relationship to a single TTML document instance. Any such association is outside the scope of TTML.

Regarding duplication of functionality specified elsewhere, TTML does not recognize functionality external to TTML except in the context of a higher level protocol. It is entirely irrelevant whether a wrapper format supports carriage of the same or similar information. TTML has never depended on the existence of an external wrapper, and making that argument here is specious. You could make the exact same argument about ttp:frameRate, but nobody does.

Regarding circular definition, I fail to see any circularity, and I fail to see why you think the current language doesn't make any sense. It makes perfect sense to me. The root temporal interval always is an interval of the form [0s,end], where end is the end time of the document's active interval. However, the value 0s may not coincide with the media time of 0s of a related media object. For example, consider a document where the first (temporally speaking) ISD is assigned the temporal interval [36000s,36002s], i.e., from 10:00:00 to 10:00:02, and, further, that the document's use of 10:00:00 is intended to correspond with media time 0s of a related media object. In this case, there is a 36000s or 10h offset between the media object's time interval and the timed text time interval. In order to express this very important relationship the document would specify ttp:mediaOffset="-36000s" (or equivalent) parameter, and a presentation processor would add this offset to all ISD times in order to synchronize with the related media object's media time.
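
The arithmetic in this example can be sketched as follows. This is only an illustration of the "add the offset to all ISD times" reading given above; `to_media_time` is a hypothetical name, not something defined by TTML.

```python
def to_media_time(isd_time_s: float, media_offset_s: float) -> float:
    """Map a document (ISD) time to media time by adding the offset."""
    return isd_time_s + media_offset_s


first_isd = (36000.0, 36002.0)  # 10:00:00 to 10:00:02 on the document timeline
media_offset = -36000.0         # ttp:mediaOffset="-36000s"

print(tuple(to_media_time(t, media_offset) for t in first_isd))  # (0.0, 2.0)
```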

Note that this parameter is already described (but not defined) in TTML1 N.2:

Note:

The above formalisms assumes that the Root Temporal Extent corresponds with the beginning of a related media object. If this assumption doesn't hold, then an additional offset that accounts for the difference may be introduced when computing media time M.

Regarding SMPTE timebase application problem, this "problem" only appears when one is using SMPTE discontinuous mode, so I agree this mode would be excluded; however, it still applies to SMPTE continuous mode, which is not based on label matching, but is isomorphic with media time base.

Regarding handling of negative times is undefined, I agree that some language is needed to handle this case effectively. Probably something to the effect "after applying the media offset, a negative time value is set to 0s", thus if the entire offsetted interval is on the negative side of the origin, then it ends up being [0s,0s), which is effectively empty. And if the offsetted interval only partially intersects with the origin, e.g., [-5s,5s), then it would be clipped to [0s,5s).
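
A small sketch of the clamping rule suggested above ("after applying the media offset, a negative time value is set to 0s"). The function names are hypothetical, and intervals are treated as half-open `[begin, end)` pairs in seconds.

```python
from typing import Tuple

Interval = Tuple[float, float]  # half-open [begin, end) in seconds


def offset_and_clamp(interval: Interval, media_offset_s: float) -> Interval:
    """Apply the offset, then set any negative endpoint to 0s."""
    begin, end = interval
    return (max(0.0, begin + media_offset_s), max(0.0, end + media_offset_s))


def is_empty(interval: Interval) -> bool:
    """A half-open interval with end <= begin contains no time points."""
    begin, end = interval
    return end <= begin


# Offsetted interval [-5s, 5s) is clipped to [0s, 5s).
print(offset_and_clamp((5.0, 15.0), -10.0))  # (0.0, 5.0)

# Offsetted interval wholly below the origin collapses to [0s, 0s), i.e. empty.
print(is_empty(offset_and_clamp((1.0, 3.0), -10.0)))  # True
```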

Regarding does not identify beginning of root temporal extent, as I point out above, the value of the beginning time of the root temporal interval is always 0s. However, by itself, this time has no defined relationship with a related media object (or any other external temporal interval). The purpose of ttp:mediaOffset is precisely to provide an author an ability to specify this relationship, which answers the question implied by the above language cited from TTML1 (about an additional offset). As such, this parameter indirectly answers how to

define the beginning of intended presentation, i.e. the entry point in the document instance's timeline, or the point before which any ISDs should be truncated

which you, for some unknown reason, don't seem to think it does, even though I keep insisting that this is exactly what it does.

Finally, nothing about ttp:mediaOffset implies a need for an external processor to modify any time expression in a TTML document in the case that the offset changes. For example, if a related media object has additional leader content pre-pended to it, or an existing leader is removed, then only the value of ttp:mediaOffset on the <tt> element would be modified, and that wouldn't be needed if a higher level protocol applies so as to use an externally specified offset, e.g., one that comes from a wrapper that overrides an internally specified offset.

nigelmegitt commented 7 years ago

Thanks for the detailed responses @skynavga. It seems we have partial agreement but significant misalignment of understanding still. I'll take one more pass at explaining where I still disagree; then I guess we should discuss in a conversation rather than here, if I still have not convinced you.

Regarding multiple presentation contexts, TTML does not recognize the notion of multiple media objects or multiple presentation contexts in relationship to a single TTML document instance.

Indeed! However it does not constrain the number of related presentation contexts at all. So we should not introduce a new thing that limits the maximum number of presentation contexts.

Regarding duplication of functionality specified elsewhere, TTML does not recognize functionality external to TTML ... TTML has never depended on the existence of an external wrapper, and making that argument here is specious. You could make the exact same argument about ttp:frameRate, but nobody does.

No, there's a fundamental qualitative difference here between ttp:frameRate and ttp:mediaOffset. The frame rate is describing the correct interpretation of time expressions within the document; there is no requirement in general that the frame rate matches some related media, though in the specific case of matching smpte time code it is likely to cause problems if they don't match. However ttp:mediaOffset is attempting to relate the time expressions directly to a particular (anonymous) related media object.

Since we know that this exact same functionality is present elsewhere (e.g. in ISOBMFF) and that duplicating it will likely cause misunderstanding, as a Group we need to recognise that and not create a problem that does not need to exist. More on this at the end of this post.

Regarding circular definition ...

The definition of media time is that 0s is the beginning of the media's timing reference and the playrates match up. If you have media with associated timings that are used as an alternative reference and the TTML time expressions for any given moment in the media timeline do not match them, then that is not media timebase.

The example you provide is essentially a smpte timebase example. If you apply an offset to the TTML document instance's time expressions, then they will no longer match up, and presentation will be broken. This is a bad thing.

You don't seem to believe me that this is not only a discontinuous markerMode issue - in reality in smpte timebase exactly the same approach is likely to be used to match media time code to TTML time values regardless of whether the markerMode is continuous or discontinuous. The distinction is a nicety of TTML chiefly, albeit that the computed times may be different in the two cases if there are nested timed elements.

I fail to see why you think the current language doesn't make any sense. It makes perfect sense to me.

Well I'm providing one data point that the current language is very hard to understand. Since it makes perfect sense to you, perhaps you could try to rewrite it in simpler terms? To be more specific about my issue with it, here's a question I cannot answer reading the current text: is the "relationship with some presentation context" being defined by ttp:mediaOffset or is ttp:mediaOffset additional to that relationship?

Regarding does not identify beginning of root temporal extent, as I point out above, the value of the beginning time of the root temporal interval is always 0s. However, by itself, this time has no defined relationship with a related media object (or any other external temporal interval).

Yes, it does have a relationship - it means that the origin coordinates of the media time and the TTML document instance's time coordinates are identical. If this is ambiguous now, my preference would be to clarify that point in the spec.

The reason I don't think that ttp:mediaOffset defines the entry point is that it is instead a modifier of the time expressions. Even if you don't accept that the origin coordinates of the media time and the document instance's time coordinates are identical in general, if in a specific case they happen to be identical, i.e. ttp:mediaOffset="0s" as currently defined, you may still want to say "this document defines content beginning at 1234567s", so it does not define an entry point in that case. As I suggested, perhaps they are orthogonal concepts. The entry point would not offset the relationship between the document instance's time expressions and the related media's times.

nothing about ttp:mediaOffset implies a need for an external processor to modify any time expression in a TTML document in the case that the offset changes

I'm not sure which of my points this is supposed to address - I never suggested otherwise; however the spec as I currently read it does require that the offset is applied as an additional computation step when a processor is processing time expressions in the document.

Since the clock timebase is already excluded, and I'm asserting that mediaOffset only creates a problem in smpte timebase, and that it is unnecessary in media timebase since the relationship is already defined, all the scenarios in which ttp:mediaOffset might apply have been removed.

For example, if a related media object has additional leader content pre-pended to it, or an existing leader is removed, then only the value of ttp:mediaOffset on the <tt> element would be modified, and that wouldn't be needed if a higher level protocol applies so as to use an externally specified offset, e.g., one that comes from a wrapper that overrides an internally specified offset.

There is no text in the current ED to say that a higher level protocol may be used to override it; such an addition may help to address my concern about wrapper formats. The question will arise otherwise: is ttp:mediaOffset the same as a presentation time offset in, say, DASH, or additional to it, or something different? We could be explicit here, or, my preference, remove ttp:mediaOffset altogether.

The fact that TTML doesn't recognise external functionality doesn't mean that as a group managing the spec we don't recognise it. There is a context to our work.

Removal of the leader content should be done in my view by specifying the entry point, before which ISDs should be truncated.

By the way, seemingly to argue against myself, there is a practical use case for wanting to apply some form of offset operationally: we often find that subtitle files are somewhat misaligned against the media, i.e. the requirement that the origin coordinates for the media and for the document instance are aligned is not satisfied. In that case we have a process that modifies the time expressions in the document to apply the required offset.

In my (purist?) view modifying the time expressions in the document is more robust and also more computationally efficient since the additional processing is done once rather than requiring it to be duplicated by all the presentation processors. It also means that there is no backward compatibility problem with legacy processors interpreting the times differently to newer ones, should those newer ones handle a media offset via a provided parameter attribute.

nigelmegitt commented 7 years ago

I'm aware that this issue remains open and we don't have a proposed resolution; as things stand I'm objecting to ttp:mediaOffset in its current form for the reasons described at length above. However, I seem to be the only vocal person who is unhappy with this, so I'm willing to allow it through for wide review and see if anyone else shares my concerns, as long as we also add the orthogonal begin clipping functionality - calling it ttp:clipBegin would match what we have done in the audio element quite neatly; for symmetry we could add ttp:clipEnd also. I will propose this in a pull request.

palemieux commented 7 years ago

@nigelmegitt I have objected to this feature before.

skynavga commented 7 years ago

There is a way to deal with features you don't want to use:

  1. don't use it
  2. prohibit it in a profile

Just because you don't find it useful doesn't mean others won't. Are you going to object to every other feature you don't want to use?

I simply don't recognize the legitimacy of an argument that this information must be transported externally to a document. That is an extremely silly argument. TTML does not recognize and cannot depend on external usage behavior.

On Tue, Jun 27, 2017 at 8:33 AM, Pierre-Anthony Lemieux < notifications@github.com> wrote:

@nigelmegitt https://github.com/nigelmegitt I have objected to this feature before https://github.com/w3c/ttml2/issues/125#issuecomment-301981162.


nigelmegitt commented 7 years ago

I think the point here @skynavga is that there are several members who find the mere presence of ttp:mediaOffset actively dangerous.

skynavga commented 7 years ago

That's absurd.


nigelmegitt commented 7 years ago

Regardless of how absurd you may think it is @skynavga the fact is that we have at least 2 members of the group objecting, so from a process perspective alone we need to address this.

So far I have probably been the most vocal in explaining my reasons for objecting, and I've not seen anything yet that addresses those reasons, for example a more clearly defined semantic, a use case, text that addresses the identified operational problems the feature is likely to introduce etc., or of course removal of the feature.

By the way, it cannot currently be prohibited in a profile because there is no feature designator that corresponds with ttp:mediaOffset support.

skynavga commented 7 years ago

On Wed, Jun 28, 2017 at 2:23 AM, Nigel Megitt notifications@github.com wrote:

Regardless of how absurd you may think it is @skynavga https://github.com/skynavga the fact is that we have at least 2 members of the group objecting, so from a process perspective alone we need to address this.

So far I have probably been the most vocal in explaining my reasons for objecting, and I've not seen anything yet that addresses those reasons, for example a more clearly defined semantic, a use case, text that addresses the identified operational problems the feature is likely to introduce etc., or of course removal of the feature.

I have no idea why anyone would think it is dangerous. What is your rationale? Please contrast such a rationale with ttp:frameRate.

By the way, it cannot currently be prohibited in a profile because there is no feature designator that corresponds with ttp:mediaOffset support.

That can be easily remedied.


nigelmegitt commented 7 years ago

I have no idea why anyone would think it is dangerous. What is your rationale? Please contrast such a rationale with ttp:frameRate.

You're asking me to repeat things that are detailed at length in this issue already, for example at https://github.com/w3c/ttml2/issues/323#issuecomment-303989905 where I point out that ttp:frameRate relates to the time expressions within the document instance, and is arithmetically independent of the frame rate of any related media, whereas ttp:mediaOffset is expressing a relationship between the document timeline and the timeline of one or possibly more than one unspecified and unspecifiable related media objects. The data model here is just wrong for ttp:mediaOffset.

skynavga commented 7 years ago

On Wed, Jun 28, 2017 at 4:40 AM, Nigel Megitt notifications@github.com wrote:

I have no idea why anyone would think it is dangerous. What is your rationale? Please contrast such a rationale with ttp:frameRate.

You're asking me to repeat things that are detailed at length in this issue already, for example at #323 (comment) https://github.com/w3c/ttml2/issues/323#issuecomment-303989905 where I point out that ttp:frameRate relates to the time expressions within the document instance, and is arithmetically independent of the frame rate of any related media, whereas ttp:mediaOffset is expressing a relationship between the document timeline and the timeline of one or possibly more than one unspecified and unspecifiable related media objects. The data model here is just wrong for ttp:mediaOffset.

I am operating under the premise that for media and continuous smpte mode time bases, the following hold:

  1. there is a known related media object, and the document was explicitly authored based on knowledge of the timeline of this related media object;
  2. at present (in TTML1), there is no defined relationship between a document timeline and a related media object's timeline, i.e., a time expression 10:00:00:00 appearing in a document could mean the start of the related media object timeline, the end of the related media object timeline, or any arbitrary time point before the start of the related media object timeline, during the related media object timeline, or after the related media object timeline;
  3. TTML1 recognizes the existence of this ambiguity by referring to an offset that would define such a relationship, but it does not specify it in detail, and does not specify a particular use of the offset;
  4. an author may wish to specify this offset and have it retained in the document so as to capture authorial intentions;
  5. an author may wish a processor to make use of this offset in the absence of external wrapper information or another form of externally provided parameter data;
  6. TTML1 currently defines a number of parameters that perform a similar function, all of which relate document timeline semantics to related media object timeline semantics:
    • ttp:frameRate
    • ttp:frameRateMultiplier
    • ttp:subFrameRate
    • ttp:tickRate

There is nothing wrong about a data model the intent of which is to give specificity to otherwise unexpressed parameters. There is nothing dangerous in this model that is not already present with respect to any of the above parameters, which you should be asking to be removed in accordance with the logic you are using to deny legitimacy to ttp:mediaOffset.

At the bottom of it all, you appear to accept the existence of the fundamental ambiguity described above, and wish to actively prevent resolving that ambiguity. You also are assuming the existence of a wrapper model that is external to TTML and wish to assign that external mechanism the sole responsibility for defining the time relationship between the two timelines.

All in all, your logic makes no sense to me.


nigelmegitt commented 7 years ago

Addressing the points where we have not yet converged:

  2. at present (in TTML1), there is no defined relationship between a document timeline and a related media object's timeline

There is indeed a defined relationship: the origin of the media timeline in the document is identical to the origin of the related media object's timeline.

  3. TTML1 recognizes the existence of this ambiguity by referring to an offset that would define such a relationship, but it does not specify it in detail, and does not specify a particular use of the offset;

I think you're referring to the final note in TTML1 §N.2:

If this assumption doesn't hold, then an additional offset that accounts for the difference may be introduced when computing media time M.

Knowledge of whether or not this assumption holds is in the document processing context. By specifying the offset we are actually introducing a new ambiguity between the document instance and the document processing context. If you want to say something more than the current note in TTML1 then the change we should introduce is to clarify that the document processing context must apply any additional offset, or the document should be modified to re-base the times such that the offset becomes zero once again.

  4. an author may wish to specify this offset and have it retained in the document so as to capture authorial intentions;

This allows for the author to arbitrarily choose a different basis for media timing compared to the media available at authoring time. This is anarchy. It is defining a new timebase that is not any of the existing timebases. It is most closely related to smpte except without the expectation of using time expressions that have frames components. It might be acceptable for me if we add a new value for ttp:timeBase called offsetMedia or something similar that incorporates this offset.

Consider a TTML1 processor that quietly ignores ttp namespace attributes it does not support. Now we create a TTML2 document with ttp:timeBase="media" and a non-zero ttp:mediaOffset value, that is otherwise a usable TTML1 document, perhaps authored that way for backwards compatibility. Now the offset is especially dangerous since our TTML1 processor will get the times all wrong, and calculate them differently from a TTML2 processor.
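
The divergence can be sketched with a toy document, under stated assumptions: `paragraph_times` is a hypothetical helper, it handles only simple offset-time values such as `36000s`, and the "offset-aware" branch follows the "add the offset to every computed time" reading discussed in this issue rather than any final specification text.

```python
import xml.etree.ElementTree as ET

TT = "http://www.w3.org/ns/ttml"
TTP = "http://www.w3.org/ns/ttml#parameter"

DOC = f"""<tt xmlns="{TT}" xmlns:ttp="{TTP}"
    ttp:timeBase="media" ttp:mediaOffset="-36000s">
  <body><div><p begin="36000s" end="36002s">Hello</p></div></body>
</tt>"""


def seconds(value: str) -> float:
    """Parse only the simple offset-time form used here, e.g. '-36000s'."""
    return float(value.rstrip("s"))


def paragraph_times(xml_text: str, honour_media_offset: bool):
    """Return (begin, end) for each p element, optionally applying the offset."""
    root = ET.fromstring(xml_text)
    offset = 0.0
    if honour_media_offset:
        offset = seconds(root.get(f"{{{TTP}}}mediaOffset", "0s"))
    return [
        (seconds(p.get("begin")) + offset, seconds(p.get("end")) + offset)
        for p in root.iter(f"{{{TT}}}p")
    ]


print(paragraph_times(DOC, honour_media_offset=False))  # processor that ignores the attribute
print(paragraph_times(DOC, honour_media_offset=True))   # processor that applies it
```

The first call yields times around 36000s and the second yields times around 0s for the same document, which is the mismatch described above.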

  5. an author may wish a processor to make use of this offset in the absence of external wrapper information or another form of externally provided parameter data;

And what is the semantic if both are present? Is the intent that the ttp:mediaOffset is transferred to the external wrapper or that both are used, additively? There's not enough detail about this use case in the current draft of the spec. As it stands, it must be an additive rule.
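
To make the two candidate semantics concrete, here is a hedged sketch. `effective_offset` and the rule names "additive" and "override" are hypothetical labels for the two readings asked about above; neither is taken from the draft.

```python
from typing import Optional


def effective_offset(document_offset_s: float,
                     wrapper_offset_s: Optional[float],
                     rule: str) -> float:
    """Combine a document-level offset with an optional wrapper offset."""
    if wrapper_offset_s is None:
        return document_offset_s  # nothing external: use the document's value
    if rule == "additive":
        return document_offset_s + wrapper_offset_s
    if rule == "override":
        return wrapper_offset_s
    raise ValueError(f"unknown rule: {rule}")


isd_begin = 36000.0
for rule in ("additive", "override"):
    offset = effective_offset(-36000.0, -36000.0, rule)
    print(rule, isd_begin + offset)
```

If the same -36000s value has been copied into the wrapper, the additive rule applies it twice and places the ISD at -36000s, while the override rule places it at 0s.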

  6. TTML1 currently defines a number of parameters that perform a similar function, all of which relate document timeline semantics to related media object timeline semantics: [etc]

For all of [ttp:frameRate, ttp:frameRateMultiplier, ttp:subFrameRate and ttp:tickRate] the values are defined completely by the document instance even if there is no external media or application defined value. The fact that an author may choose to make them coincident with the related media is a convenience, but the document time expressions can be evaluated and are well defined regardless of whether or not those attribute values match the equivalents in the related media. If the application specifies them externally, that's fine too since only one value can apply. In the case of media offset, it is a value that can be replaced or added and the semantic is unclear, or most likely it is that the values are added.
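
As an illustration of "well defined regardless of the related media", the sketch below evaluates two kinds of time expression using only parameter values from the document. It is a simplification that assumes the media timebase and non-drop-frame counting; the function names are hypothetical.

```python
from fractions import Fraction


def clock_time_with_frames(expr: str, frame_rate: int,
                           frame_rate_multiplier: Fraction = Fraction(1)) -> Fraction:
    """Evaluate hh:mm:ss:ff using only ttp:frameRate and ttp:frameRateMultiplier."""
    hh, mm, ss, ff = (int(part) for part in expr.split(":"))
    effective_rate = frame_rate * frame_rate_multiplier
    return Fraction(3600 * hh + 60 * mm + ss) + Fraction(ff) / effective_rate


def tick_time(expr: str, tick_rate: int) -> Fraction:
    """Evaluate an offset-time in ticks, e.g. '10000000t', using ttp:tickRate."""
    return Fraction(int(expr.rstrip("t")), tick_rate)


# Same expression, two document-declared frame rates; no media is consulted.
print(float(clock_time_with_frames("10:00:00:12", 25)))
print(float(clock_time_with_frames("10:00:00:12", 30, Fraction(1000, 1001))))
print(float(tick_time("10000000t", 10_000_000)))
```

Each result depends only on values carried in the document instance, which is the contrast being drawn with a media offset.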

you appear to accept the existence of the fundamental ambiguity described above, and wish to actively prevent resolving that ambiguity.

Not quite - rather, I don't want us to introduce a further ambiguity, and that is how I read the current draft specification text. But more fundamentally I think the ambiguity cannot be resolved only within a document instance, and must be resolved by an external process, certainly with ttp:timeBase="media". This is the document processing context, regardless of what form it takes, be it a wrapper or something else.

nigelmegitt commented 7 years ago

I split out the temporal clipping issue into #483 so we can continue with this issue being only about the specification of media offset, and what that should/shouldn't/does/doesn't mean.

skynavga commented 6 years ago

Based on the result recorded in #125, retargeting this issue to actually remove the ttp:mediaOffset parameter attribute, restoring substance of original note at end of I.2 (TTML1 N.2).

skynavga commented 5 years ago

Since ttp:mediaOffset has already been removed from TTML2 (https://github.com/w3c/ttml2/commit/9211107b31ae72382b7318b90d4452790ee97e26), no issue remains for further processing. If such an issue should exist, then I suggest a new issue be filed here (on TTML3) or on TTML2 as appropriate. Accordingly, I am closing this issue with no further action.