notofonts / latin-greek-cyrillic

Noto Latin, Greek, Cyrillic
SIL Open Font License 1.1

U+0337 & U+0338 Combining Solidus Overlay Misplaced on Many Characters #105

Closed kmuncie closed 1 year ago

kmuncie commented 4 years ago

U+0337 & U+0338 Combining Solidus Overlay Misplaced

Font

NotoSans-Regular.ttf

This issue has also been noted in NotoSans-Bold and is assumed to exist in other variants as well.

Where the font came from, and when

https://github.com/googlefonts/noto-fonts/blob/master/phaseIII_only/hinted/ttf/NotoSans/NotoSans-Regular.ttf

Font Version

Version 2.003; ttfautohint (v1.8.2)

Issue

When U+0337 Combining Short Solidus Overlay or U+0338 Combining Long Solidus Overlay is used, it is rendered incorrectly in most cases: the overlay is misplaced relative to the base character.

This issue was discovered when trying to render text in the Chinantec (Ojitlan) language, which makes extensive use of the character e with U+0337 Combining Short Solidus Overlay.

An example of a word in this language is "köse̷".

Note how this word is rendered in Noto Sans Regular:

noto-sans-solidus

Now, compare this to the rendering in DejaVu Sans:

deja-vu-soldius

Character data

Test string for U+0337 Combining Short Solidus Overlay

A̷ a̷ B̷ b̷ C̷ c̷ D̷ d̷ E̷ e̷ F̷ f̷ G̷ g̷ H̷ h̷ I̷ i̷ J̷ j̷ K̷ k̷ L̷ l̷ M̷ m̷ N̷ n̷ O̷ o̷ P̷ p̷ Q̷ q̷ R̷ r̷ S̷ s̷ T̷ t̷ U̷ u̷ V̷ v̷ W̷ w̷ X̷ x̷ Y̷ y̷ Z̷ z̷ 0̷ 1̷ 2̷ 3̷ 4̷ 5̷ 6̷ 7̷ 8̷ 9̷

Test string for U+0338 Combining Long Solidus Overlay

A̸ a̸ B̸ b̸ C̸ c̸ D̸ d̸ E̸ e̸ F̸ f̸ G̸ g̸ H̸ h̸ I̸ i̸ J̸ j̸ K̸ k̸ L̸ l̸ M̸ m̸ N̸ n̸ O̸ o̸ P̸ p̸ Q̸ q̸ R̸ r̸ S̸ s̸ T̸ t̸ U̸ u̸ V̸ v̸ W̸ w̸ X̸ x̸ Y̸ y̸ Z̸ z̸ 0̸ 1̸ 2̸ 3̸ 4̸ 5̸ 6̸ 7̸ 8̸ 9̸
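
For anyone who wants to regenerate these test strings (for example, to try a different combining mark), here is a minimal Python sketch. The function name `overlay_test_string` is made up for illustration; the base characters and ordering follow the strings above.

```python
import string

# Code points of the two overlay marks discussed in this issue.
SHORT_SOLIDUS = "\u0337"  # COMBINING SHORT SOLIDUS OVERLAY
LONG_SOLIDUS = "\u0338"   # COMBINING LONG SOLIDUS OVERLAY


def overlay_test_string(mark: str) -> str:
    """Return 'A<mark> a<mark> B<mark> ... 9<mark>' for a combining mark."""
    # Interleave upper- and lowercase letters: A, a, B, b, ..., Z, z.
    letters = [
        ch
        for pair in zip(string.ascii_uppercase, string.ascii_lowercase)
        for ch in pair
    ]
    bases = letters + list(string.digits)
    # Each base character is immediately followed by the combining mark.
    return " ".join(base + mark for base in bases)


if __name__ == "__main__":
    print(overlay_test_string(SHORT_SOLIDUS))
    print(overlay_test_string(LONG_SOLIDUS))
```

The output can be pasted into any renderer or passed to a shaping tool to compare fonts.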

Harfbuzz hb-view

Example character data above as generated through hb-view:

soldius-test

dscorbett commented 4 years ago

This should be fixed, but Unicode discourages the use of U+0337 and U+0338 in orthographies. For example, the letter motivating this bug report is U+0247 ⟨ɇ⟩ LATIN SMALL LETTER E WITH STROKE, which Noto supports.
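
As a rough illustration of that suggestion, text that uses e followed by U+0337 could be converted to the encoded letter. This is only a sketch: the mapping table and the function name `use_encoded_stroke_letters` are hypothetical, and whether the substitution is appropriate for a given orthography is for its users to decide.

```python
import unicodedata

# Hypothetical mapping for illustration only:
# base letter + U+0337 -> encoded letter with stroke.
STROKE_LETTERS = {
    "e\u0337": "\u0247",  # LATIN SMALL LETTER E WITH STROKE
    "E\u0337": "\u0246",  # LATIN CAPITAL LETTER E WITH STROKE
}


def use_encoded_stroke_letters(text: str) -> str:
    """Replace each mapped base+overlay sequence with its encoded letter."""
    for sequence, letter in STROKE_LETTERS.items():
        text = text.replace(sequence, letter)
    return text


if __name__ == "__main__":
    word = "k\u00f6se\u0337"  # the example word from this issue
    converted = use_encoded_stroke_letters(word)
    print(converted)
    print(unicodedata.name(converted[-1]))
```

Note that this changes the underlying code points, so the converted text will no longer match searches for the original sequence.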

kmuncie commented 4 years ago

Thank you for the information. For reference, I found the relevant guidance in Section 7.9 of Unicode 13.0.0:

Overlaid Diacritics. A few combining marks are encoded to represent overlaid diacritics such as U+0335 combining short stroke overlay (= “bar”) or hooks modifying the shape of base characters, such as U+0322 combining retroflex hook below. Such overlaid diacritics are not used in decompositions of characters in the Unicode Standard. Overlaid combining marks for the indication of negation of mathematical symbols are an exception to this rule and are discussed later in this section.

One should use the combining marks for overlaid diacritics sparingly and with care, as rendering them on letters may create opportunities for spoofing and other confusion.

Sequences of a letter followed by an overlaid diacritic or hook character are not canonically equivalent to any preformed encoded character with diacritic even though they may appear the same. See “Non-decomposition of Certain Diacritics” in Section 2.12, Equivalent Sequences for more discussion of the implications of overlaid diacritics for normalization and for text matching operations.
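
The non-equivalence described in that passage can be checked with Python's standard `unicodedata` module. This is a small sketch; the helper name `canonically_equivalent` is made up for illustration.

```python
import unicodedata

sequence = "e\u0337"    # e + COMBINING SHORT SOLIDUS OVERLAY
precomposed = "\u0247"  # LATIN SMALL LETTER E WITH STROKE


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# U+0247 has no decomposition mapping, so normalization never turns it
# into (or builds it from) a base letter plus an overlay mark.
print(unicodedata.decomposition(precomposed) == "")   # True
print(canonically_equivalent(sequence, precomposed))  # False

# For contrast, an ordinary accent does compose: e + U+0301 vs. U+00E9.
print(canonically_equivalent("e\u0301", "\u00e9"))    # True
```

In practice this means a search or comparison that relies on normalization will treat the two spellings as different strings.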

nizarsq commented 4 years ago

Screen Shot 2020-07-24 at 10 00 40 PM

nizarsq commented 4 years ago

The issue also seems to be reproducible in NotoSerif and NotoSansMono.

simoncozens commented 1 year ago

We need to go through every base glyph and add a center anchor. Ideally this should be positioned by a designer, but I'm tempted to script it.

simoncozens commented 1 year ago

Actually with all the fonts, upright and italic, and all the masters, there would be around 25,000 anchors needed, so yeah, I'm just going to script it.
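
To give a sense of what scripting this could look like, here is a minimal sketch in plain Python. It is not the script that was actually used for the fix. It assumes that the anchor is placed at the center of each glyph's bounding box, that bounds are already available as `(xMin, yMin, xMax, yMax)` tuples, and that the names `center_anchor` and `add_center_anchors` are hypothetical; the sample bounds are made-up numbers.

```python
from typing import Dict, Tuple

Bounds = Tuple[float, float, float, float]  # (xMin, yMin, xMax, yMax)
Point = Tuple[int, int]


def center_anchor(bounds: Bounds) -> Point:
    """Place the anchor at the center of the glyph's bounding box."""
    x_min, y_min, x_max, y_max = bounds
    return round((x_min + x_max) / 2), round((y_min + y_max) / 2)


def add_center_anchors(glyph_bounds: Dict[str, Bounds]) -> Dict[str, Point]:
    """Compute a center anchor position for every base glyph given."""
    return {name: center_anchor(b) for name, b in glyph_bounds.items()}


if __name__ == "__main__":
    # Made-up bounds in font units, for illustration only.
    sample = {"A": (0, 0, 640, 714), "e": (50, -10, 510, 546)}
    for name, position in add_center_anchors(sample).items():
        print(name, position)
```

A bounding-box center ignores the optical center of asymmetric glyphs and the slant of italics, which is why positions set by a designer would still be the ideal.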