w3c / imsc-hrm

IMSC Hypothetical Render Model
https://w3c.github.io/imsc-hrm/spec/imsc-hrm.html
Other
1 stars 6 forks source link

span elements are included in NBG(R_i) #5

Closed palemieux closed 8 months ago

palemieux commented 3 years ago

NBG(Ri) counts the number of tts:backgroundColor attributes specified span elements.

In a common scenario illustrated below, this results in the complexity of painting (relatively small) span backgrounds to be equal to painting the background of (relatively much larger) region that essentially fills the root container.

image

This can be addressed by excluding span from the NBG(Ri) computation, and instead including tts:backgroundColor in the list of glyph properties at https://www.w3.org/TR/ttml-imsc1.1/#paint-text.

nigelmegitt commented 3 years ago

@palemieux please could you explain more about how this would work? I don't understand the reasoning at the moment:

The paint text model is based on glyph copying, and the glyphs are copied after the background has been drawn. The background is not the same for every glyph with the listed style properties, even if tts:backgroundColor is included, because it does not always have the same size. It may be extended either in the inline or the block progression direction by use of itts:fillLineGap or ebutts:linePadding.

Even if the background were the same size, the rendering performance factor for drawing a background would need to be taken into account, and presumably would be much less than for drawing the glyph: in this proposal what change would be made to the values of Ren(Gi)?

palemieux commented 3 years ago

@nigelmegitt Do you agree that the current complexity measure, where a full region redraw is assumed on every span-specified tts:backgroundColor, does not make sense, right? In other words, complexity for span-specified tts:backgroundColor should scale with character area and not region area.

nigelmegitt commented 3 years ago

@palemieux it depends what you need to achieve. I do agree that real implementations are unlikely to redraw all of the pixels in a region n times where n is the number of background colours specified for active children of that region. At least, it's very much a worst case scenario. I could imagine some actually do have event loops that might cause this many redraws, if they update some background painting map every time they encounter a new background rectangle.

But as a proxy measure of complexity, which is easy to compute, I think it is okay to assume this worst case scenario, generate some number (which is all the HRM does, after all), and then set thresholds based on the numbers that real world content requires.

Sure, we could go to a huge amount of further detail, computing the areas of each of the elements selected into the region, and multiplying the number of pixels by the number of times each element gets drawn etc.

Let me ask you a more meta question back: what is the motivation behind increasing the level of precision? Is there some real world content that colours a lot of span backgrounds differently and falls foul of the HRM?

css-meeting-bot commented 3 years ago

The Timed Text Working Group just discussed span elements are included in NBG(R_i) w3c/imsc#571, and agreed to the following:

The full IRC log of that discussion <nigel> Topic: span elements are included in NBG(R_i) w3c/imsc#571
<nigel> github: https://github.com/w3c/imsc/issues/571
<nigel> Nigel: This is about where you count the background colour of spans.
<nigel> Pierre: The problem with the current spec is that the cost of drawing a background on span
<nigel> .. is the same as the cost of drawing a background on the entire region.
<nigel> .. That seems to scale the wrong way.
<nigel> .. Intuitively you'd think that the cost of drawing the background of a span
<nigel> .. should roughly scale with the number of characters in the span, not the area of the region.
<nigel> Nigel: My feeling is "kinda" and "maybe" and "it's not that bad is it?"
<nigel> .. I mean, some spans might be as big as a region, but it's unusual.
<nigel> .. As a worst case scenario, it's not so bad. We're just looking for a complexity value.
<nigel> Pierre: It is bad though because a p with one span, compared to the same p with the same text in multiple spans,
<nigel> .. generates a higher complexity value.
<nigel> Nigel: You mean there are documents that fail compared to the threshold now, that should pass?
<nigel> Pierre: Yes, there are cases where there are 2 documents with the same rendered output,
<nigel> .. and one fails and one passes, based on how many spans are in the p output.
<nigel> Nigel: It'd be good to see test cases.
<nigel> .. Although the rendered output is the same, that doesn't mean that the rendering complexity was the same.
<nigel> Pierre: Imagine an implementation that uses fixed width bitmap characters.
<nigel> .. The cost of applying a background to a span should never scale as the area of the region,
<nigel> .. because the background would scale with every character drawn.
<nigel> .. You would not redraw the entire background of the region every time you blit a character.
<nigel> .. it would just be the area of the character.
<nigel> Nigel: It would be useful to have test cases. Am I right that there are no HRM tests in imsc-tests?
<nigel> Pierre: Right, so the project I'm doing could probably add them.
<nigel> Nigel: Any other thoughts on this?
<nigel> SUMMARY: Continue the discussion on the issue
palemieux commented 3 years ago

@nigelmegitt Below is the kind of document I have in mind:

<?xml version="1.0" encoding="UTF-8"?>
<tt xml:lang="en"
    xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling"
    xmlns:ebutts="urn:ebu:tt:style">
  <head>
    <layout>
      <region xml:id="r1"
              tts:origin="10% 10%"
              tts:extent="80% 80%"
              tts:color="white"
              tts:textAlign="center"
              tts:backgroundColor="transparent"
              tts:showBackground="whenActive"
              tts:displayAlign="after"
              ebutts:linePadding="0.5c"/>
    </layout>
  </head>
  <body region="r1">
    <div>
      <p begin="0s" end="0.3s">
        <span tts:backgroundColor="black">Hello my name is</span><br/>
        <span tts:backgroundColor="black">Gaspard Hunt</span><br/>
        <span tts:backgroundColor="black">and I will <span tts:backgroundColor="red">rock</span> your world...</span>
      </p>
      <p begin="0.3s" end="2s">
        <span tts:backgroundColor="black">Gaspard Hunt</span><br/>
        <span tts:backgroundColor="black">and I will <span tts:backgroundColor="red">rock</span> your world...</span><br/>
        <span tts:backgroundColor="black">But first let's take a selfie.</span>
      </p>
    </div>
  </body>
</tt>

The <p begin="0.3s" end="2s"> has four elements with non-transparent tts:backgroundColor. Each one triggers a full redraw of the entire region, whose area is 64% of the root container region, even though each element is much smaller than the entire region. As a result the HRM draw time is 0.349s, which means that the previous <p> can be on-screen for no more than 0.349s.

The region is transparent and large so that the document can use a single region with no risk of overflow.

nigelmegitt commented 3 years ago

Thanks for that example @palemieux . It clarifies things. I'm still thinking about it, and want to share what I'm mulling over:

palemieux commented 3 years ago

@nigelmegitt I agree it is too early to reach a conclusion on this issue, and recommend waiting for more feedback from implementers and users.

btsimonh commented 3 years ago

I'm at a loss to understand why every span background would cause a cost of a complete region fill... I would expect a new buffer per time period, with a single clear and then draw on. In fact, in real world double buffered implementations, you may expect each buffer to be tagged with it's dirty rectangle, and only that area to need to be cleared - if the buffers are re-used - resulting in a much reduced pixel fill.

Note on region size: I have generally adopted the use of regions which extend from the top or bottom safe area edge to the required text position. In practice, as in pierre's example, this means large regions for every title. Small region sizes implies you know the text extents, and in this world of 'make my text bigger' in renderers, you can't know the final text sizes as the player may change that. So the best you can do is 'anchor' the text to the position you want it, and encourage the player to expand from there?

nigelmegitt commented 3 years ago

So the best you can do is 'anchor' the text to the position you want it, and encourage the player to expand from there?

You could alternatively set tts:overflow="visible" and set the appropriate value of tts:displayAlign if you don't need any other behaviour that depends on the region edges, such as background color.

nigelmegitt commented 3 years ago

I'm at a loss to understand why every span background would cause a cost of a complete region fill... I would expect a new buffer per time period, with a single clear and then draw on.

The HRM is not intended to be a representation of a performance-optimised real world renderer - in fact it probably makes sense to make it in some ways a deliberately sub-optimal model, so that even naive implementations are likely to play back conformant documents acceptably.

palemieux commented 3 years ago

so that even naive implementations are likely to play back conformant documents acceptably.

As it stands, the complexity of drawing the background of a span remains comparable to drawing the entire background of a parent region that fills the root container, even if the font size of the span is 0. I do not think believe this is correct.

nigelmegitt commented 3 years ago

the complexity of drawing the background of a span remains comparable to drawing the entire background of a parent region that fills the root container

Yes, this reflects an extremely naive renderer, for example one that proceeds linearly through rendering, writing directly to a single pixel buffer, and redraws all the background areas every time it encounters something that modifies how those background areas should be drawn.

I'm not saying that the pointer is necessarily at the right place on the dial now; just that moving it all the way to "highly optimised" would also be a mistake.

btsimonh commented 3 years ago

painting the whole region for every background colour makes no sense after reading 10.2, which explicitly groups the painting of all background colours before the painting of foregrounds.
Indeed, in my own renderer, during parse I make a list of all the boxes that need to be painted (whilst noting all the text to be drawn and it's size, 'cos else you don't know the box size), then only when all positioning is known, paint all the boxes, then the outlines, then the foregrounds. It's actually really important to separate these things, else a background/outline will overlay a foreground, and that is not good.... Indeed, I would argue for adding painting of outlines as an extra step in 10.2, with an extra render cost - because in reality that's how you have to do it.

palemieux commented 2 years ago

I'm not saying that the pointer is necessarily at the right place on the dial now; just that moving it all the way to "highly optimised" would also be a mistake.

The ratio of the cost of drawing a background behind a span over the cost of drawing a background behind the root container will approach 0 as the font size approaches 0, even for an extremely naive renderer.

nigelmegitt commented 2 years ago

I'm not saying that the pointer is necessarily at the right place on the dial now; just that moving it all the way to "highly optimised" would also be a mistake.

The ratio of the cost of drawing a background behind a span over the cost of drawing a background behind the root container will approach 0 as the font size approaches 0, even for an extremely naive renderer.

I don't think that's necessarily correct. The current terrible case naive renderer model builds a map of all the backgrounds, as it progresses linearly through the content, and every time it finds a new one, redraws the whole region background.

Anyhow, I think it's a strong point by @btsimonh that the processing requirements for a renderer mean that text layout has to happen in full before background painting can begin. Perhaps it would make sense to define the creation of a set of rectangles in different colours (and transparencies) and count the painting time of each of those, dependent on its visible area. But doing it as part of glyph painting seems problematic, for the reasons stated earlier.

Then we'd have:

  1. Layout the text
  2. Generate an ordered set of coloured rectangles from the region, body, div, p and span-generated areas.
  3. Count the paint time of each based on its size
  4. Count the glyph draw time.
btsimonh commented 2 years ago

"Perhaps it would make sense to define the creation of a set of rectangles in different colours (and transparencies)" - this is almost exactly what I do in my renderer which I wrote some time back. 1/ layout the text. 2/ create a list of the 'boxes' required to be drawn. 3/ create a list of the 'background elements' (outline, shadow, etc) with bounding boxes/locations. 4/ create a list of the textual elements with bounding boxes/locations. render the lists in order 2,3,4 For a render of EACH element (be it box, background, or text), don't forget it may need to read the existing pixel values to correctly blend. e.g. the background and text elements may well end up with an image which includes pixels which have transparency.... even the boxes may be anti-aliased (50% opacity) top/bottom to reduce interlace twitter. Non anti-aliased characters and backgrounds are just plain ugly :). I think we can state the assumptions of the HRM - and these don't have to absolutely bottom dollar... but from the above, maybe we need to consider 3 x blend operations per 'text area' rendered? (or 3 x blend operations for the occupied area of the region?)

nigelmegitt commented 2 years ago

it may need to read the existing pixel values to correctly blend. e.g. the background and text elements may well end up with an image which includes pixels which have transparency

+1 - I expect we'd factor this into the painting "speed"

nigelmegitt commented 8 months ago

Discussed in TTWG meeting today, minuted at https://www.w3.org/2023/12/21-tt-minutes.html#r01 :

Nigel: Yes, it felt like we were overestimating complexity but in practice everything has passed.

Pierre: We should keep it open but note that we are not addressing it in imsc-hrm v1.

Nigel: I think I will propose something different. … I think we should close on the basis that we haven't demonstrated that there's a real world problem. … If later someone says they have documents that they think should pass, but which don't, … then at that stage we should investigate and open an issue based on the investigation, and there's … a chance it could be this cause, so we could reopen or open a new issue. But right now … this is not a demonstrated real world problem.

Pierre: I'm fine with that. … We should note it as a comment, or record what we just discussed.

Nigel: We spent a lot of time on that issue, but I think it gets trumped by the implementation experience.

Gary: We could post this on the issue, wait a short while and then close if nobody objects.

Nigel: +1 … I will add a pointer to this conversation to the issue after the meeting and propose to close it with no change.

The proposal is to close this issue with no change on the basis that in practice, the potential overestimate of background painting time in the HRM discussed in this issue has not caused any documents to fail validation unexpectedly.

If, in the future, implementation and usage experience shows that there are documents that are expected to pass HRM validation but actually fail, then we would analyse those cases and either open new issues or, if the cause is the same, we could reopen this issue.

I will return to this after January 8th 2024 and if there are no objections, I will close this issue.