w3c / svgwg

SVG Working Group specifications

tspan and text shaping #634

Open RazrFalcon opened 5 years ago

RazrFalcon commented 5 years ago
<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Arial" font-size="48">
    <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180" stroke="gray" stroke-width="0.5"/>

    <text id="text1" x="50" y="100">
        T<tspan fill="green">e</tspan>xt
    </text>

    <text id="text2" x="50" y="140">
        Text
    </text>

    <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black"/>
</svg>

[image: test13]

According to the SVG spec, text shaping should be done on the whole text chunk, which in our case is the whole word. But it looks like only Firefox does this right. Is that so, or am I missing something?

RazrFalcon commented 5 years ago

If we try to change the font, then even Firefox fails.

<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Arial" font-size="64">
    <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180
                            M 79 20 L 79 180 M 121 20 L 121 180"
          stroke="gray" stroke-width="0.5"/>

    <text id="text1" x="100" y="100" text-anchor="middle" fill="red">
        AVA
    </text>

    <text id="text2" x="100" y="100" text-anchor="middle">
        A<tspan font-weight="bold">V</tspan>A
    </text>

    <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black"/>
</svg>

[image: e-tspan-024]

Since everyone failed (except my own library) I'm not even sure what should be done in this case. At least I think that the expected result is the one that resvg produces.

PS: Batik doesn't support text shaping at all, so we can ignore it.

msand commented 5 years ago

It seems they all treat it as separate text chunks and ignore kerning. What happens if you use a ligature like 'fi'? I would assume they render separate glyphs because the characters are in different DOM nodes.

fsoder commented 5 years ago

In the latter example, there are two different faces involved ("Arial" and "Arial Bold" [1]), and AFAIK GPOS/kern will only apply within the same face. (Compare with if the <tspan> had a different font-family.)

[1] Or whatever "Arial" ends up mapped to. Even if an implementation were to synthesize bold, it would need to modify the metrics as well and in practice treat it as a separate face.

RazrFalcon commented 5 years ago

@fsoder Yes, these are actually two separate font files. The problem is that I don't understand how it should be handled. What is the correct/expected result?

RazrFalcon commented 5 years ago

@msand I'm not sure how to define it.

<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Verdana" font-size="48">

    <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180" stroke="gray" stroke-width="0.5"/>

    <!-- \x66\x69 -->
    <text id="text1" x="100" y="60" text-anchor="middle">
        fi
    </text>

    <!-- \x66\xE2\x80\x8C\x69 -->
    <text id="text2" x="100" y="100" text-anchor="middle">
        f‌i
    </text>

    <!-- \xEF\xAC\x81 -->
    <text id="text3" x="100" y="140" text-anchor="middle">
        fi
    </text>

    <!-- \x66\xE2\x80\x8C\x69 -->
    <text id="text4" x="100" y="180" text-anchor="middle">
        f<tspan font-weight="bold">i</tspan>
    </text>

    <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black"/>
</svg>

[image: e-tspan-024]

Not sure what to expect, but Inkscape and QtSvg definitely have some problems.
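For readers comparing the `\x..` comments in the SVG above with the actual characters, here is a minimal Python sketch that prints the UTF-8 bytes of the three spellings of "fi" used in the test. The helper name `utf8_escapes` is made up for illustration and is not part of any tool mentioned in this thread.

```python
# Hypothetical helper (not from the thread): show a string's UTF-8 bytes
# in the \xNN notation used in the SVG comments above.
def utf8_escapes(text: str) -> str:
    return "".join(f"\\x{byte:02X}" for byte in text.encode("utf-8"))


plain_fi = "fi"          # U+0066 U+0069, two ordinary letters
zwnj_fi = "f\u200ci"     # U+200C ZERO WIDTH NON-JOINER between f and i
ligature_fi = "\ufb01"   # U+FB01 LATIN SMALL LIGATURE FI, a single codepoint

for label, value in [("plain", plain_fi), ("zwnj", zwnj_fi), ("ligature", ligature_fi)]:
    print(label, utf8_escapes(value))
```

The zero width non-joiner case is the one that explicitly asks the shaper not to form a ligature, while U+FB01 is a precomposed ligature character that needs no shaping at all.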

fsoder commented 5 years ago

... What is the correct/expected result?

Assuming that kerning does not apply across different faces, I'd say that the renderings that don't apply kerning are the expected ones. One might also argue, assuming support for the font-kerning property, that all renderings are correct (since the initial value of that property is auto, which leaves it to the UA to decide whether kerning should be applied, and maybe resvg is applying kerning optically). I suspect that this is more in the territory of the CSS Fonts and CSS Text specifications. The latter says:

Text shaping must not be broken across inline box boundaries when there is no effective change in formatting, or if the only formatting changes do not affect the glyphs (as in applying text decoration).

Text shaping should not be broken across inline box boundaries otherwise, if it is reasonable and possible for that case given the limitations of the font technology.

https://drafts.csswg.org/css-text-3/#boundary-shaping

(Example 33 just after the quoted text gives an example [the last] that is close to what the latter example is about.)

RazrFalcon commented 5 years ago

So basically, we can do whatever we want? But in the first example, Firefox is still the correct one, because a color change doesn't affect glyphs?

CSS Text specifications

New docs are impossible to navigate...

RazrFalcon commented 5 years ago

@fsoder actually, the main problem is that it breaks BIDI reordering (see https://github.com/w3c/svgwg/issues/635) as well.

fsoder commented 5 years ago

... in the first example, Firefox is still the correct one, because a color change doesn't affect glyphs?

Yes.

... the main problem is that it breaks BIDI reordering (see #635) as well.

I think the same applies there. (Firefox and Batik appear to render that correctly; librsvg is not reordering correctly, and others don't shape as expected.)

msand commented 5 years ago

@RazrFalcon I was thinking mainly about testing with a font where fi is ligaturized, e.g. Times, and comparing it with e.g. fe, to see how the handling of text chunks / ligatures / kerning / shaping is affected by other attributes and by whether the characters are in different DOM nodes, such as this:

<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Times" font-size="48">

  <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180" stroke="gray" stroke-width="0.5" />

  <!-- \x66\x69 -->
  <text id="text1" x="100" y="60" text-anchor="middle">
    fi
  </text>

  <!-- \x66\x69 -->
  <text id="text2" x="100" y="100" text-anchor="middle">
    f<tspan fill="green">i</tspan>
  </text>

  <!-- \x66\x65 -->
  <text id="text3" x="100" y="140" text-anchor="middle">
    fe
  </text>

  <!-- \x66\x65 -->
  <text id="text4" x="100" y="180" text-anchor="middle">
    f<tspan fill="green">e</tspan>
  </text>

  <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black" />
</svg>

Btw, do you have some script you're using to create images for all the renderers? Would you mind sharing it?

RazrFalcon commented 5 years ago

It's not a script, but a GUI app that I use to test resvg. It's available here, but it's not ready for public use. The main problem is building and installing all the dependencies (e.g. batik, headless chrome, latest librsvg), which can be non-trivial.

Results:

[image: test20]

Only Qt is different/broken.

msand commented 5 years ago

@RazrFalcon It seems none of them are using the ligaturized glyph. Here's Chrome locally:

[image: screen shot 2019-01-28 at 19 08 32]

msand commented 5 years ago

Could your version of Times be missing the GSUB tables? https://docs.microsoft.com/en-us/typography/opentype/spec/gsub http://ilovetypography.com/OpenType/opentype-features.html

RazrFalcon commented 5 years ago

Maybe some OS interference. I'm on Linux.

msand commented 5 years ago

It seems Firefox preserves the ligature even when the characters are in different DOM nodes:

[image: screen shot 2019-01-28 at 19 33 06]

Safari renders the same as Chrome.

RazrFalcon commented 5 years ago

Wow! Looks cool. But the question is what behavior is required by the standard.

msand commented 5 years ago

Well, at least here it seems to depend on how one interprets "DOM text node / separated by markup". If that includes tspan elements, then Firefox seems to break the definition of text chunk and the ligature rules; otherwise, if only text elements count as text nodes, then Chrome is breaking the spec: https://www.w3.org/TR/SVG2/text.html#TermTextChunk

text chunk An independent block of text in which all characters are positioned together. Each new absolute positioning adjustment (due to an ‘x’ or ‘y’ attribute, or forced line break) creates a new text chunk. Ligature substitution and bidi-reordering only occur within a text chunk. Text chunks are only relevant to pre-formatted text.

And later: https://www.w3.org/TR/SVG2/text.html#FontsGlyphs

Ligatures are an important feature of advance text layout. Some ligatures are discretionary while others (e.g. in Arabic) are required. The following explicit rules apply to ligature formation:

Ligature formation should not be enabled when characters are in different DOM text nodes; thus, characters separated by markup should not use ligatures.

Ligature formation should not be enabled when characters are in different text chunks.

Discretionary ligatures should not be used when the spacing between two characters is not the same as the default space (e.g. when letter-spacing has a non-default value, or text-align has a value of justify and text-justify has a value of distribute). (See CSS Text Module Level 3, ([css-text-3]).

And in the end of: https://www.w3.org/TR/SVG2/text.html#GlyphsMetrics

While kerning or ligature processing might be font-specific, the preferred model is that kerning and ligature processing occurs between combinations of characters or glyphs after the characters have been re-ordered.

It seems the SVG spec doesn't say much about when kerning gets enabled or disabled by other attributes. But as long as the font face is the same (and perhaps the font-size as well?), it should be possible to use the values from the kern tables. Or perhaps some other spec clarifies this?
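To make the quoted text chunk definition concrete, here is a rough Python sketch of how runs might be grouped into chunks. It is only an illustration under simplifying assumptions: the `Run` class and `split_into_chunks` function are hypothetical, each run carries at most one absolute x/y, and forced line breaks and per-character position lists are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Run:
    """Characters from one <text> or <tspan> element (hypothetical model)."""
    text: str
    x: Optional[float] = None  # absolute x position, if the element sets one
    y: Optional[float] = None  # absolute y position, if the element sets one


def split_into_chunks(runs: List[Run]) -> List[str]:
    """Start a new chunk whenever a run has an absolute x or y position."""
    chunks: List[str] = []
    for run in runs:
        repositioned = run.x is not None or run.y is not None
        if repositioned or not chunks:
            chunks.append(run.text)
        else:
            # Markup alone (e.g. a <tspan> that only sets fill) does not split.
            chunks[-1] += run.text
    return chunks


# T<tspan fill="green">e</tspan>xt from the first example: one chunk.
first_example = split_into_chunks([Run("T", x=50, y=100), Run("e"), Run("xt")])

# A <tspan> with its own x starts a second chunk.
repositioned_example = split_into_chunks([Run("f", x=100, y=60), Run("i", x=130)])
```

Under this reading, a tspan that only changes fill stays in the same chunk as its neighbours, so the chunk rule by itself would not forbid a ligature there; it is the separate "different DOM text nodes" rule that would.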

msand commented 5 years ago

Only relevant thing I can find is the part referenced by @fsoder

7.3. Shaping Across Element Boundaries

Text shaping must be broken at inline box boundaries when any of the following are true for any box whose boundary separates the two typographic character units:

Any of margin/border/padding separating the two typographic character units in the inline axis is non-zero.

vertical-align is not baseline.

The boundary is a bidi isolation boundary.

Text shaping must not be broken across inline box boundaries when there is no effective change in formatting, or if the only formatting changes do not affect the glyphs (as in applying text decoration).

Text shaping should not be broken across inline box boundaries otherwise, if it is reasonable and possible for that case given the limitations of the font technology.

So if the same glyphs and metrics would have been chosen had the text been authored without separating markup, then shaping must not be broken. Nothing seems to specify how to scale kerning when the glyphs have different font-size, weight, etc., so it seems safe to ignore kerning in that case. And color, for example, shouldn't disable kerning, at least from a strict reading of that text. But the de facto standard way to implement the spec seems to have been to disable kerning when capitalized characters are separated by markup from characters they would otherwise have kerning adjustments for, as can be seen from your Text and AVA examples. They do, however, preserve kerning for lowercase characters, such as fe and fi in my example.
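One way to read the three quoted CSS Text rules is as a small decision procedure. The sketch below is only an interpretation, not anything taken from a real implementation; the `Boundary` fields and the `shaping_requirement` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    """Properties of an inline box boundary between two characters (hypothetical)."""
    has_margin_border_padding: bool = False   # non-zero in the inline axis
    vertical_align_is_baseline: bool = True
    is_bidi_isolation: bool = False
    formatting_affects_glyphs: bool = False   # e.g. font-weight or font-family change


def shaping_requirement(boundary: Boundary) -> str:
    """Classify a boundary according to the quoted CSS Text section."""
    if (
        boundary.has_margin_border_padding
        or not boundary.vertical_align_is_baseline
        or boundary.is_bidi_isolation
    ):
        return "must break"
    if not boundary.formatting_affects_glyphs:
        return "must not break"
    return "should not break if reasonable and possible"


# <tspan fill="green">: the colour change does not affect the glyphs.
fill_only = shaping_requirement(Boundary())

# <tspan font-weight="bold">: a different face is selected.
bold_span = shaping_requirement(Boundary(formatting_affects_glyphs=True))
```

On this reading the fill-only case falls under "must not break", while the bold case falls under the softer "should not break" rule, which leaves room for the differing renderings of the AVA example.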

msand commented 5 years ago

Hmm, seems the kerning is affected even with lower case:

<svg viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Arial" font-size="48" style="font-kerning: normal">
    <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180" stroke="gray" stroke-width="0.5"/>

    <text id="text1" x="100" y="100" text-anchor="middle">
        f<tspan>fi</tspan>
    </text>

    <text id="text2" x="100" y="140" text-anchor="middle">
        ffi
    </text>

    <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black"/>
</svg>

What looked like kerning was just glyphs rendering further than their advance.

msand commented 5 years ago

And here again, Firefox seems to apply kerning if the glyphs and metrics are the same. But all kinds of font and text styles can disable kerning when they are applied to the tspan, even when they do not necessarily change either glyphs or metrics.

AmeliaBR commented 5 years ago

Some thoughts:

Tavmjong commented 5 years ago

Firefox's behavior in painting a glyph with two different colors is definitely interesting... but it would probably yield unexpected results for complex scripts where reordering is possible.

css-meeting-bot commented 5 years ago

The SVG Working Group just discussed tspan and text shaping.

The full IRC log of that discussion
<krit> Topic: tspan and text shaping
<krit> GitHub: https://github.com/w3c/svgwg/issues/634
<krit> See discussions on https://github.com/w3c/svgwg/issues/635 for details
<krit> AmeliaBR: previous resolution to BIDI and harmonizing with CSS Text does cover this issue.
<krit> AmeliaBR: CSS guidance has more nuances
<chris> I think that was the best we could assume, *at the time*
<krit> AmeliaBR: Harminizing means we drop the rule that any markup boundaries disables ligatures
<krit> s/Harminizing/Harmonizing/
<krit> AmeliaBR: some cases might not be defined specifically at the end. CSS ends up with auto a lot.
<krit> AmeliaBR: Should it be possible to paint half of the ligature in another color than the other? Firefox does it.
<krit> AmeliaBR: it divides the offset distances in half
<krit> Tav: this may cause trouble for more complicated glyphs
<krit> myles: agree.
<krit> myles: anything like that is too complicated.
<krit> Tav: we wouldn't implement it
<krit> AmeliaBR: what should happen?
<krit> AmeliaBR: what if the color is dynamic on selection?
<krit> myles: I would prefer to leave this up to the implementation. Firefox behaviour shouldn't be illegal but wouldn't want to implement it. So spec should not dictate one or the other way.
<krit> krit: need to resolve?
<krit> AmeliaBR: maybe we tie this into CSS Text
<krit> myles: if this WG thinks one behavior is better than the other, CSS should do the same.
<krit> krit: does sound like we don't need another resolution. We would do what ever CSS does.
<krit> krit: this is covered by harmonising resolution.

RazrFalcon commented 2 years ago

Here is another example that doesn't rely on ligatures:

<svg id="svg1" viewBox="0 0 200 200" xmlns="http://www.w3.org/2000/svg" font-family="Noto Sans" font-size="48">
    <path id="crosshair" d="M 20 100 L 180 100 M 100 20 L 100 180" stroke="gray" stroke-width="0.5"/>
    <text id="text1" x="85" y="100">и<tspan fill="green">̆ </tspan></text>
    <text id="text2" x="85" y="145">й</text>
    <rect id="frame" x="1" y="1" width="198" height="198" fill="none" stroke="black"/>
</svg>

Results:

[image: e-text-042]

I'm surprised that basically everyone split off the ̆, even though both codepoints are in the same text chunk and should be normalized into a single codepoint/grapheme.
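The relationship between the two spellings can be checked with Python's standard `unicodedata` module. This only demonstrates Unicode canonical composition; whether a renderer is required to normalize across a tspan boundary is a separate question that this sketch does not settle.

```python
import unicodedata

decomposed = "\u0438\u0306"   # и (CYRILLIC SMALL LETTER I) + COMBINING BREVE
precomposed = "\u0439"        # й (CYRILLIC SMALL LETTER SHORT I)

# NFC composes the base letter and the combining mark into one codepoint.
composed = unicodedata.normalize("NFC", decomposed)

# NFD goes the other way and splits the precomposed letter again.
split_again = unicodedata.normalize("NFD", precomposed)

print(len(decomposed), len(composed), composed == precomposed)
```

In the SVG above the base letter and the combining mark sit in different elements, so a renderer that shapes each element separately never sees them as one cluster.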