w3c / imsc-hrm

IMSC Hypothetical Render Model
https://w3c.github.io/imsc-hrm/spec/imsc-hrm.html

Prepare for requesting Horizontal Review #12

Closed nigelmegitt closed 1 year ago

nigelmegitt commented 2 years ago

Do the checklist at https://www.w3.org/Guide/documentreview/ :

nigelmegitt commented 2 years ago

Accessibility

None of the questions in the accessibility questionnaire apply to the IMSC-HRM specification. For all of the "If ..." rows in the table, the answer is that they do not apply.

Looking at the related Media Accessibility Checklist linked from the accessibility questionnaire, I think we can say something like:

nigelmegitt commented 2 years ago

Internationalisation

Short i18n review checklist is here

  1. [ ] If the spec (or its implementation) contains any natural language text that will be read by a human (this includes error messages or other UI text, JSON strings, etc, etc),

    Does not apply.

  2. [ ] If the spec (or its implementation) allows content authors to produce typographically appealing text, either in its own right, or in association with graphics.

    The purpose of the IMSC-HRM specification is to allow subtitle and caption authors and providers to verify that the content they provide does not exceed defined complexity levels, so that playback systems can render the content synchronised with the author-specified display times. In order to define complexity levels, the specification makes assumptions on the rendering complexity of various scripts.

  3. [ ] If the spec (or its implementation) allows the user to point into text, creates text fragments, concatenates text, allows the user to select or step through text (using a cursor or other methods), etc.

    Does not apply.

  4. [ ] If the spec (or its implementation) allows searching or matching of text, including syntax and identifiers

    Does not apply.

  5. [ ] If the spec (or its implementation) sorts text

    Does not apply.

  6. [ ] If the spec (or its implementation) captures user input

    Does not apply.

  7. [ ] If the spec (or its implementation) deals with time in any way that will be read by humans and/or crosses time zone boundaries

    Does not apply. The spec does deal with times, as specified via the IMSC and TTML specifications, but no presentation of times is dealt with within the IMSC-HRM specification itself.

  8. [ ] If the spec (or its implementation) allows any character encoding other than UTF-8.

    Does not apply.

  9. [ ] If the spec (or its implementation) defines markup.

    Does not apply.

  10. [ ] If the spec (or its implementation) deals with names, addresses, time & date formats, etc

    Does not apply.

  11. [ ] If the spec (or its implementation) describes a format or data that is likely to need localization.

    Does not apply. The IMSC and TTML specifications define a format that is suitable for use in the activity of localization, but no specific requirements relating to localization exist within the IMSC-HRM specification.

  12. [ ] If the spec (or its implementation) makes any reference to or relies on any cultural norms

    Does not apply.

nigelmegitt commented 2 years ago

Privacy and Security

This specification has no inherent security or privacy implications.

The algorithm defined within this specification is used for static analysis of a resource. This specification does not define any protocol or interface for obtaining such a resource, and it does not define any interface for exposing the results of the analysis. No personal or sensitive information is processed as part of the algorithm. No information is exposed by the algorithm to any origin. No scripts are loaded or processed as part of the algorithm.

The only missing section is the content of the privacy and security section, for which I propose #13, including the above analysis and a subsection advising implementers to be well behaved.

nigelmegitt commented 2 years ago

Architecture

To progress an issue with TAG for review, we need an explainer.

nigelmegitt commented 2 years ago

@palemieux (and anyone else interested) status update: I've gone through each of the items required for horizontal review, and proposed responses to each of them. If you agree, please tick the box in the initial comment at the top of the issue, or if not, let's discuss.

We need to add content to the privacy and security section: I suggest we handle the details of that via comments on #13, or if you think something much different will suffice, please open a competing pull request. We should not tick that box until we've merged a pull request.

We also need an Explainer - I do not think we have one at the moment.

palemieux commented 2 years ago

@nigelmegitt Re: i18n, the specification makes assumptions about the relative rendering complexity of scripts. Is it worth noting? Perhaps something along the lines of:

The purpose of the IMSC-HRM specification is to allow subtitle and caption authors and providers to verify that the content they provide does not exceed defined complexity levels, so that playback systems can render the content synchronised with the author-specified display times. In order to define complexity levels, the specification makes assumptions on the rendering complexity of various scripts.

nigelmegitt commented 2 years ago

@palemieux agreed, I've updated https://github.com/w3c/imsc-hrm/issues/12#issuecomment-972920219 to include that text.

palemieux commented 2 years ago

@nigelmegitt I can take a stab at the explainer.

palemieux commented 2 years ago

@nigelmegitt Draft explainer below. @mikedo your input on the ATSC mention below would be appreciated.

IMSC Hypothetical Render Model (HRM)

Introduction

The IMSC Hypothetical Render Model (HRM) constrains the processing complexity of subtitle and caption documents that conform to the IMSC Recommendation.

The HRM is not a new concept: it has been included in all versions and editions of the IMSC Recommendation and has remained substantially unchanged.

In order to simplify future maintenance, the TTWG wishes to refactor the HRM into its own Recommendation.

Goals

The objective of the HRM is to allow subtitle and caption authors and providers to verify that the content they provide does not exceed defined complexity levels, so that playback systems can render the content synchronised with the author-specified display times.

Non-goals

The HRM does not specify:

User research

The HRM was included in the IMSC Recommendation based on market research conducted in the context of its predecessor specification (CFF-TT). This research demonstrated that, unless constrained in complexity, a syntactically valid IMSC document could not be guaranteed to be reliably rendered on all client devices since they do not share identical computing power. For example, a television typically has a fraction of the computing power available to a desktop PC.

More recently, experience deploying IMSC in ATSC 3.0 systems demonstrated that, in the absence of an HRM, it is trivial to convert legacy CEA 608/708 captions to IMSC documents whose complexity exceeds the capabilities of client devices.

Design

The HRM specifies a (hypothetical) time required for painting subtitles and captions. Painting includes drawing region backgrounds, rendering and copying glyphs, and decoding and copying images. Complexity is then limited by requiring that the time to paint a subtitle/caption is shorter than the time elapsed since the previous subtitle/caption.

Stakeholder Feedback

Stakeholder interest has resulted in the creation of an open source implementation of the HRM.

nigelmegitt commented 2 years ago

Thanks @palemieux looks like a great start.

Suggestions:

In Non-goals, add:

  • IMSC processor and renderer performance requirements.
  • Editorial requirements, for example subtitle and caption reading rates.

In Design :

Add that the model is intended to generate an estimate from static analysis of the document and that there's no intent to require actual rendering. For example, no font fetching or glyph rendering is performed. For example:

The calculation of time taken to paint is based on static analysis of the IMSC document and requires no fetching of external resources or glyph rendering. It is not intended to be an accurate calculation of rendering time, but a useful indicator of document complexity.

Add sub-heading:

Accessibility, Privacy and Security

The purpose of this work is to enable content providers to maximise the chance that their captions and subtitles function as designed, improving the accessibility of the media supply chain as a whole. The IMSC-HRM does not specify any human-targeted outputs and therefore has no direct accessibility implications in itself.

Since the IMSC-HRM defines the static analysis of a resource, with no specified information flowing to any origin server during the analysis, there are no direct implications or concerns regarding privacy and security.

I think it would be helpful if we could supply one or two examples too. For example, a snippet of an IMSC document that generates an ISD whose paint time is, say, 400ms (roughly), showing that if the begin time of this ISD is <400ms after the begin of the previous ISD, then that's a fail, but if it is more, then it's a pass. Previous experience has been that a concrete example is much easier for a new reader to understand than a theoretical or abstract explanation, for most humans.
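
As a rough illustration of the pass/fail rule described above, here is a minimal Python sketch. The function name `isd_passes` and its parameters are hypothetical, and the 400ms paint time is the illustrative figure from this comment rather than a value computed from a real document. Treating an exactly equal gap as a fail is an assumption.

```python
def isd_passes(prev_begin_s, begin_s, paint_time_s):
    """Return True if the ISD can be painted in the gap since the previous ISD began.

    Hypothetical helper: the boundary case (gap exactly equal to the paint
    time) is treated as a fail, which is an assumption.
    """
    return paint_time_s < begin_s - prev_begin_s


# ISD begins 300ms after the previous one, but needs 400ms to paint: fail.
print(isd_passes(0.0, 0.3, 0.4))  # False

# ISD begins 500ms after the previous one: pass.
print(isd_passes(0.0, 0.5, 0.4))  # True
```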

palemieux commented 2 years ago

[Update 2021-11-26 09:19 UTC] The explainer text below, in this comment, has now been copied into https://github.com/w3c/imsc-hrm/blob/main/explainer.md so any changes to it should now be made by pull request.


@nigelmegitt Revised explainer below. I did not add "IMSC processor and renderer performance requirements" to the non-goals because the HRM indirectly implies performance requirements: renderers should probably be able to render documents that pass the HRM.

IMSC Hypothetical Render Model (HRM)

Introduction

The IMSC Hypothetical Render Model (HRM) constrains the processing complexity of subtitle and caption documents that conform to the IMSC Recommendation.

The HRM is not a new concept: it has been included in all versions and editions of the IMSC Recommendation and has remained substantially unchanged.

In order to simplify future maintenance, the TTWG wishes to refactor the HRM into its own Recommendation.

Goals

The objective of the HRM is to allow subtitle and caption authors and providers to verify that the content they provide does not exceed defined complexity levels, so that playback systems can render the content synchronised with the author-specified display times.

Non-goals

The HRM does not specify:

User research

The HRM was included in the IMSC Recommendation based on market research conducted in the context of its predecessor specification (CFF-TT). This research demonstrated that, unless constrained in complexity, a syntactically valid IMSC document could not be guaranteed to be reliably rendered on all client devices since they do not share identical computing power. For example, a television typically has a fraction of the computing power available to a desktop PC.

More recently, experience deploying IMSC in ATSC 3.0 systems demonstrated that, in the absence of an HRM, it is trivial to convert legacy CEA 608/708 captions to IMSC documents whose complexity exceeds the capabilities of client devices.

Design

The HRM specifies a (hypothetical) time required for painting subtitles and captions. Painting includes drawing region backgrounds, rendering and copying glyphs, and decoding and copying images. Complexity is then limited by requiring that the time to paint a subtitle/caption is shorter than the time elapsed since the previous subtitle/caption.

The calculation of the time required for painting subtitles and captions is based on a static analysis of the IMSC document and requires no fetching of external resources or glyph rendering. It is not intended to be an accurate calculation of rendering time, but a proxy of document complexity.

Stakeholder Feedback

Stakeholder interest has resulted in the creation of an open source implementation of the HRM.

Accessibility, Privacy and Security

The purpose of this work is to enable content providers to maximise the chance that their captions and subtitles function as designed, improving the accessibility of the media supply chain as a whole. The HRM does not specify any human-targeted outputs and therefore has no direct accessibility implications in itself.

Since the HRM defines the static analysis of a resource, with no specified information flowing to any origin server during the analysis, there are no direct implications or concerns regarding privacy and security.

Example

<?xml version="1.0" encoding="UTF-8"?>
<tt xml:lang="en"
    xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <head>
    <layout>
      <region xml:id="r1" tts:extent="100% 100%"/>
    </layout>
  </head>
  <body region="r1">
    <div>
      <p begin="0s" end="1s">
        <span>hello</span>
      </p>
      <p begin="1s" end="2s" xml:lang="fr">
        <span>bonjour bonjour</span>
      </p>
    </div>
  </body>
</tt>

For the first subtitle (hello):

rendering time = time to render {"h", "e", "l", "o"} + time to copy {"l"}
               = 1/15 * 1/15 * (4 / 1.2 + 1 / 12)
               = 0.0152s

For the second subtitle (bonjour bonjour):

rendering time = time to clear the screen + time to render {"b", "n", "j", "u", "r", " "}
                    + time to copy {"o", "o", "b", "o", "n", "j", "o", "u", "r"}
               = 1/12 + 1/15 * 1/15 * (6 / 1.2 + 9 / 12)
               = 0.1089s
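
The arithmetic above can be reproduced with a short Python sketch. The constant names, the `paint_time` helper, and the `clear_screen` flag are hypothetical. The numeric values are simply read off the terms in the calculation (1/15 × 1/15 per glyph, 1.2 for rendering, 12 for copying, 1/12 for clearing). The glyph cache is assumed to persist from the first subtitle to the second, which is why "o" is copied rather than rendered in "bonjour bonjour".

```python
# Values read off the worked arithmetic above; the names are illustrative.
CLEAR_RATE = 12.0                 # the 1/12 "clear the screen" term
RENDER_RATE = 1.2                 # divisor applied to rendered glyphs
COPY_RATE = 12.0                  # divisor applied to copied glyphs
GLYPH_AREA = (1 / 15) * (1 / 15)  # normalised size of one glyph


def paint_time(text, cache, clear_screen):
    """Estimate the paint time of one subtitle and update the glyph cache.

    A glyph is "rendered" the first time it is seen and "copied" whenever it
    is already in the cache or was rendered earlier in the same subtitle.
    """
    rendered = 0
    copied = 0
    new_glyphs = set()
    for ch in text:
        if ch in cache or ch in new_glyphs:
            copied += 1
        else:
            rendered += 1
            new_glyphs.add(ch)
    cache.update(new_glyphs)
    total = GLYPH_AREA * (rendered / RENDER_RATE + copied / COPY_RATE)
    if clear_screen:
        total += 1 / CLEAR_RATE
    return total


cache = set()
# The clear_screen flags mirror the terms shown in the arithmetic above.
first = paint_time("hello", cache, clear_screen=False)
second = paint_time("bonjour bonjour", cache, clear_screen=True)
print(round(first, 4), round(second, 4))  # 0.0152 0.1089
```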

The document above passes the HRM since:

nigelmegitt commented 2 years ago

Thanks @palemieux understood re the implementation performance point.

Couple of small questions:

The document above passes the HRM since the rendering time of the second subtitle (0.1089s) is less than the duration of the first subtitle (0.0152s).

  1. Isn't the duration of the first subtitle 1s (begin time 0s to end time 1s)?

  2. Should we add something about the rendering time of the first subtitle and the requirement to be less than IPD, i.e. 1s? So that passes too.

palemieux commented 2 years ago

Isn't the duration of the first subtitle 1s (begin time 0s to end time 1s)?

Doh. Bad copy-paste. Fixed.

Should we add something about the rendering time of the first subtitle and the requirement to be less than IPD, i.e. 1s? So that passes too.

I thought that would be too detailed for an explainer.

nigelmegitt commented 2 years ago

I thought that would be too detailed for an explainer.

I think it would be good. Having time to display the first subtitle/caption is important. Also by including it, we'd have covered all the passing scenarios for a text document in one simple example.
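
To make the two passing conditions concrete, here is a minimal Python sketch that checks a whole sequence of subtitles. The names `IPD_S` and `document_passes` are hypothetical; the 1s value for the IPD is taken from the "i.e. 1s" remark earlier in this thread, and the paint times are the ones computed in the explainer example. The strict "less than" comparison is an assumption about the boundary case.

```python
IPD_S = 1.0  # budget for the first subtitle, per the "IPD, i.e. 1s" remark above


def document_passes(isds):
    """Check a list of (begin_s, paint_time_s) tuples in presentation order.

    The first entry must paint in less than IPD_S; each later entry must
    paint in less than the time elapsed since the previous entry began.
    """
    prev_begin = None
    for begin, paint in isds:
        budget = IPD_S if prev_begin is None else begin - prev_begin
        if paint >= budget:
            return False
        prev_begin = begin
    return True


# Paint times from the explainer example: "hello" at 0s, "bonjour bonjour" at 1s.
example = [(0.0, 0.0152), (1.0, 0.1089)]
print(document_passes(example))  # True
```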

nigelmegitt commented 2 years ago

Oh, just realised something else: with that example someone is going to ask for an xml:lang="fr" on the second p!

palemieux commented 2 years ago

@nigelmegitt Modified as suggested.

nigelmegitt commented 2 years ago

Great, thanks @palemieux I think we're ready to request HR as soon as #13 is merged.

himorin commented 2 years ago

Hi, how about adding the proposed explainer as a separate file, as is done in other repositories such as performance-timeline?

nigelmegitt commented 2 years ago

@himorin PR open for that at #14.

nigelmegitt commented 2 years ago

Explainer now at https://github.com/w3c/imsc-hrm/blob/main/explainer.md - merging the pull request closed the issue, reopening because we're still waiting on #13

nigelmegitt commented 2 years ago

https://github.com/w3c/imsc-hrm/issues/12#issuecomment-974185428 updated to point to new explainer.md document.

nigelmegitt commented 2 years ago

Merged #13 so this work is now essentially complete. I added one further task to publish a new WD incorporating the change in that pull request.

@himorin did we set up automatic WD republication for this repo? I think we agreed to do so.

nigelmegitt commented 2 years ago

Reference to WG resolution to set up automatic republication on merge to master: https://www.w3.org/2021/10/14-tt-minutes.html#r02

himorin commented 2 years ago

@himorin did we set up automatic WD republication for this repo? I think we agreed to do so.

Aaah, yes! sorry that I totally forgot to include that in my post-FPWD updates...

himorin commented 2 years ago

filed self review results (and added review requests):

@nigelmegitt do you want to include real "Self-Review Questionnaire: Security and Privacy" results into #19 ?

xchange11 commented 2 years ago

Accessibility

None of the questions in the accessibility questionnaire apply to the IMSC-HRM specification. For all of the "If ..." rows in the table, the answer is that they do not apply.

Looking at the related Media Accessibility Checklist linked from the accessibility questionnaire, I think we can say something like:

  • The purpose of the IMSC-HRM specification is to allow subtitle and caption authors and providers to verify that the content they provide does not exceed defined complexity levels, so that playback systems can render the content synchronised with the author-specified display times. This supports the delivery and playback of content so that it can meet the CC-* requirements.

One focus of the questionnaire is the customization of content. Although customization is out of scope of TTML/IMSC it is a common use case that before rendering IMSC content the user has some control over styling features such as color and font characteristics. A comment in the IMSC HRM could add that to test rendering of IMSC documents after customization, transformed versions of the original IMSC document with thresholds of styling features (e.g. specific font size) need to be provided to the HRM.

himorin commented 2 years ago

just for memo: diff from initial text (cut out from existing IMSC 1.2 spec) to current text: https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fraw.githubusercontent.com%2Fw3c%2Fimsc-hrm%2F694ab17c5b8d571db74529689e41b23453c3b6f8%2Fspec%2Fimsc-hrm.html&doc2=https%3A%2F%2Fraw.githubusercontent.com%2Fw3c%2Fimsc-hrm%2Fmain%2Fspec%2Fimsc-hrm.html

palemieux commented 2 years ago

A comment in the IMSC HRM could add that to test rendering of IMSC documents after customization, transformed versions of the original IMSC document with thresholds of styling features (e.g. specific font size) need to be provided to the HRM.

Since the customization is done by the client, it is outside of scope for the HRM. In other words, it is up to the client to make sure that it can render HRM-conformant documents after transformation due to user customization. There is nothing for the author to do.

xchange11 commented 2 years ago

Since the customization is done by the client, it is outside of the scope for the HRM. In other words, it is up to the client to make sure that it can render HRM-conformant documents after transformation due to user customization. There is nothing for the author to do.

I agree that it is an edge case, but a very common one. IMSC is a distribution document format, so some authors may be very close to distribution and also to the client where customization is a feature. The client implementation will only consider some sample documents. Although it is technically possible for the client to limit customization features based on the complexity of each document it receives, this is, in my view, very unlikely. At least in some workflows, it may be more feasible to check the complexity of customized IMSC documents before they are deployed to the client. The HRM would be the perfect fit to test in this use case scenario.

I don't think that any normative text is needed, but an informative note to make users aware of this possibility/problem scenario could be very helpful.

nigelmegitt commented 2 years ago

an informative note to make users aware of this possibility/problem scenario could be very helpful.

Do you want to propose some text?

xchange11 commented 2 years ago

Do you want to propose some text?

I expected this question ; ) But, yes, sure. I can draft some text. Before I do, some feedback on whether people think this is in the scope of the spec would be helpful : )

nigelmegitt commented 2 years ago

if people think this is in the scope of the spec

Clearly, @palemieux and I agree that it is not in the scope of the spec, but an informative statement that customisation may change document presentation complexity might be okay - I understand the point you're making. I asked for text because I'm not quite sure what it is you think we should say, and that seemed like a good way to find out!

xchange11 commented 2 years ago

I asked for text because I'm not quite sure what it is you think we should say, and that seemed like a good way to find out!

@nigelmegitt @palemieux Find below a draft proposal. This could be added to the Accessibility Consideration section that may be introduced by #24.

In certain presentation contexts, a client may give the user the ability to change certain style characteristics of timed text. For example, many users need text content to be displayed larger than specified, not only because of un-sharp vision but also to mitigate other visual perception difficulties such as difficulty separating foreground from background. Other users with visual impairments and learning disabilities find that customizing text presentation improves their ability to distinguish letters.

Because user customization of timed text from an IMSC document instance occurs after the document is authored, this scenario is outside of the scope of this specification. However, since the customization occurs before rendering, increased document complexity could be measured by the HRM. If the customization scenario for an IMSC document instance is known, the author can test document complexity in an additional step with a transformed IMSC document instance to which thresholds of customized style features such as the maximum font size are applied. This could help to avoid unintended side effects of user customization.

nigelmegitt commented 2 years ago

Thanks @xchange11 - it's clear now what kind of change you are proposing. I think we should aim for something simpler and more concise - for me, your proposal has too much detail, which distracts from the main point. It is probably also worth referencing the MAUR sections on Closed Captioning, e.g. CC-9 and CC-11, as the basis for supporting user customisation.

I think every presentation processor that needs to support documents that meet the HRM constraints must do so as a minimum: the suggestion from your text is that document authors, who cannot know this, somehow deliberately author to a lower complexity level in order to compensate for presentation processors that offer user customisation at the expense of downgrading some aspect of the presentation, e.g. synchronisation.

I don't like the idea that authors need to worry about that: rather, the implementers of the presentation processors that offer customisation must ensure that those processors have the capability to render successfully even the most complex IMSC Document Instance, that is on the threshold of failing the HRM, even when user customisation options have been enabled.

It's also not always true that:

the customization occurs before rendering

The customisation work I have done has occurred during the rendering phase, rather than before.

the author can test document complexity in an additional step with a transformed IMSC document instance

What about the complexity of the transformation itself? The author cannot test for that.

I would propose something different, targeted at presentation processor implementers, for example this, as a note:

Implementers of presentation processors should ensure that those processors are able to present Document Instances that meet the HRM constraints, even if user customisation choices, e.g. those defined by [maur], effectively increase the complexity of presentation.

andreastai commented 2 years ago

@nigelmegitt Thanks a lot for your detailed feedback.

I think we should aim for something simpler and more concise - for me, your proposal has too much detail

Sure. I wanted to give context because it may not be clear to everybody. But happy with a simpler solution.

It is probably also worth referencing the MAUR sections on Closed Captioning, e.g. CC-9 and CC-11, as the basis for supporting user customisation.

Very good idea!

the suggestion from your text is that document authors, who cannot know this, somehow deliberately author to a lower complexity level in order to compensate for presentation processors that offer user customisation at the expense of downgrading some aspect of the presentation, e.g. synchronisation.

With "cannot know this", do you mean that authors cannot know what kind of user customization is applied? If this is the case: that may be true for some scenarios but not all. Often documents are authored/generated for distribution channels in the same organisation, and the customization features are known or this information could be retrieved.

A good point about the negative impact it could have on user experience. This needs to be avoided.

I don't like the idea that authors need to worry about that: rather, the implementers of the presentation processors that offer customisation must ensure that those processors have the capability to render successfully even the most complex IMSC Document Instance, that is on the threshold of failing the HRM, even when user customisation options have been enabled.

I think it is good that both sides are aware of potential problems.

Good point about the "guidelines" for presentation processor implementers. One side effect of this would be a shift of the responsibility from author to presentation processor implementor. Another aspect to keep in mind is that if it is the responsibility of the client/presentation processor implementer this could result in downgrading accessibility features e.g. by setting a lower threshold for font size. It may be helpful to get more views on this.

What about the complexity of the transformation itself? The author cannot test for that.

This would indeed be out of the scope of the HRM.

I would propose something different, targeted at presentation processor implementers, for example this, as a note:

Implementers of presentation processors should ensure that those processors are able to present Document Instances that meet the HRM constraints, even if user customisation choices, e.g. those defined by [maur], effectively increase the complexity of presentation.

Thanks for the proposal. It is a good summary of one option but as mentioned above I am not fully convinced yet that the author does not need to care about customization. Did you use normative language on purpose? I had an informative note in mind.

nigelmegitt commented 2 years ago

With cannot know this do you mean that authors cannot know what kind of user customization is applied?

Yes, in general authors do not know what customisation options will be applied. Some constraints may be known to the author (e.g. the BBC's constraint on authoring for the largest size) but even then, a new customisation option may be developed after the document instance was authored. It's true that within a closed system there is more control, but I don't think we need to specify that - if there is effectively a local arrangement that means the authors have to set a different complexity level, I think that's too far beyond the scope of the HRM.

Thinking about the font size customisation specifically, that is actually one area where customising to make it smaller always appears to reduce document complexity, so that might be one thing worth calling out. I'm not sure if it would be worthwhile though.

One side effect of this would be a shift of the responsibility from author to presentation processor implementor.

I'm arguing that the presentation processor implementer already has all the responsibility here.

Did you use normative language on purpose?

Yes, I think a "should" in an informative note is permitted and appropriate here. But also we could use other non-normative-language to mean the same thing. Possibly we should!

andreastai commented 2 years ago

Thanks @nigelmegitt for your detailed feedback and thoughts. That is very helpful! Your reasoning makes sense to me. I think it would be good to have some more opinions on this especially about:

We may also want to open a separate issue for this and take your proposed re-wording as a starting point.

nigelmegitt commented 1 year ago

Opened w3ctag/design-reviews#788 to request TAG review.

In order to do that, I opened #57 to add a security and privacy review.

himorin commented 1 year ago

@nigelmegitt I propose closing this, since all HR reviews are resolved.

nigelmegitt commented 1 year ago

HR now complete, closing.