canonical / Ubuntu-Sans-fonts

Many lowercase characters are mapped to uppercase glyphs #103

Open dscorbett opened 12 months ago

dscorbett commented 12 months ago

Many lowercase characters have uppercase glyphs in version 1.000 beta. This is because 'cmap' maps both the lowercase and uppercase character to the same glyph. Here is the full list:

ɓɔɖɗəɛɠɣɩɯɲɵʀʃʈʉʊʋʌⱥⱦ
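
For anyone who wants to reproduce a list like the one above, here is a minimal sketch of one way to find case pairs that share a glyph. It is not the method used to produce the report. The function name `find_shared_case_glyphs` is hypothetical, and it assumes the 'cmap' data is already available as a plain dict of code point to glyph name (with fontTools, `TTFont(path).getBestCmap()` returns such a dict).

```python
def find_shared_case_glyphs(cmap):
    """Return (lower_cp, upper_cp, glyph_name) for each lowercase code point
    that is mapped to the same glyph as its uppercase counterpart.

    `cmap` is assumed to be a dict of code point (int) -> glyph name (str).
    """
    hits = []
    for cp, glyph in sorted(cmap.items()):
        ch = chr(cp)
        upper = ch.upper()
        # Skip characters with no single-character uppercase form,
        # and characters that are already uppercase or caseless.
        if len(upper) != 1 or upper == ch:
            continue
        if cmap.get(ord(upper)) == glyph:
            hits.append((cp, ord(upper), glyph))
    return hits


# Toy data imitating the bug: U+0253 points at the capital's glyph.
example_cmap = {0x0041: "A", 0x0061: "a", 0x0181: "Bhook", 0x0253: "Bhook"}
shared = find_shared_case_glyphs(example_cmap)
```

The case mapping comes from Python's built-in Unicode data, so the result depends on the Unicode version shipped with the interpreter.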

djrrb commented 12 months ago

Thank you for reporting this! I am able to reproduce this.

Screenshot 2023-07-21 at 10 18 00 AM

If I’m not mistaken, the lowercase forms for these glyphs were never included in the original Ubuntu fonts (I checked v0.83). I’m not sure why the caps were included but the lowercase forms were not.

The lowercase glyphs are also not present in 1.000 beta. It seems that, while updating the 'cmap' table, our tools double-encoded the caps, presumably as a “better-than-nothing” strategy that prevents a case transformation from introducing missing characters into a text (line 2 above).

The ideal solution would be to add lowercase glyphs and encode them correctly (as shown above with Noto, line 3).

In the immediate term, I think it is easy/reasonable to remove the double mappings, which would make the font perform like the previous version (line 1). Tagging @lyubomir-popov in case Lyubo has an opinion.

djrrb commented 11 months ago

I removed the double encodings for these glyphs in f74f1415f90c4582d0b07c7cb87ea2f6451cfb13, so now the lowercase characters render as the .notdef glyph.

Ⱥ ⱥ Ɓ ɓ Ɗ ɗ Ɖ ɖ Ɛ ɛ Ʃ ʃ Ə ə Ɠ ɠ Ɩ ɩ Ɯ ɯ Ɲ ɲ Ɵ ɵ Ɔ ɔ Ⱦ ⱦ Ʈ ʈ Ʉ ʉ Ʊ ʊ Ɣ ɣ Ʋ ʋ Ʌ ʌ Ʀ ʀ
Screenshot 2023-08-02 at 10 13 56 AM
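
As an illustration of what removing a double encoding amounts to, here is a minimal sketch that drops the lowercase entry whenever it points at the same glyph as its uppercase counterpart. This is not the code from the commit above. The function name `drop_shared_lowercase_mappings` is hypothetical, and the input is assumed to be a plain dict of code point to glyph name.

```python
def drop_shared_lowercase_mappings(cmap):
    """Return a copy of `cmap` without lowercase code points that reuse
    the glyph of their uppercase counterpart.

    `cmap` is assumed to be a dict of code point (int) -> glyph name (str).
    """
    cleaned = dict(cmap)
    for cp, glyph in cmap.items():
        ch = chr(cp)
        upper = ch.upper()
        # Only remove the entry when a distinct, single-character uppercase
        # form exists and is mapped to the very same glyph.
        if len(upper) == 1 and upper != ch and cmap.get(ord(upper)) == glyph:
            del cleaned[cp]
    return cleaned


before = {0x0041: "A", 0x0061: "a", 0x0181: "Bhook", 0x0253: "Bhook"}
after = drop_shared_lowercase_mappings(before)
```

In a real font the same filtering would have to be applied to every 'cmap' subtable (with fontTools, each entry of `font["cmap"].tables` carries its own `cmap` dict), and the removed code points then fall back to .notdef, as described above.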

Since the best solution is to add these lowercase glyphs, I’m recategorizing this issue as an Enhancement and pasting the glyph names below for future reference:

 /Astroke Ⱥ ⱥ
 /Bhook Ɓ ɓ
 /Dhook Ɗ ɗ
 /Dtail Ɖ ɖ
 /Eopen Ɛ ɛ
 /Esh Ʃ ʃ
 /Schwa Ə ə
 /Ghook Ɠ ɠ
 /Iota-latin Ɩ ɩ
 /Mturned Ɯ ɯ
 /Nhookleft Ɲ ɲ
 /Ocenteredtilde Ɵ ɵ
 /Oopen Ɔ ɔ
 /Tdiagonalstroke Ⱦ ⱦ
 /Tretroflexhook Ʈ ʈ
 /Ubar Ʉ ʉ
 /Upsilon-latin Ʊ ʊ
 /Gamma-latin Ɣ ɣ
 /Vhook Ʋ ʋ
 /Vturned Ʌ ʌ
 /Yr Ʀ ʀ

djrrb commented 4 months ago

This is now also reported by Fontbakery:

https://github.com/fonttools/fontbakery/issues/3230
https://github.com/simoncozens/fontbakery/blob/017d2e3d8e9b06c80e651c4f73eef8742c16fa6a/CHANGELOG.md?plain=1#L39

For example:

UbuntuSansMono-Italic[wght].ttf
  • 🔥 FAIL

    The following glyphs lack their case-swapping counterparts:

    | Glyph present in the font | Missing case-swapping counterpart |
    | --- | --- |
    | U+0181: LATIN CAPITAL LETTER B WITH HOOK | U+0253: LATIN SMALL LETTER B WITH HOOK |
    | U+0186: LATIN CAPITAL LETTER OPEN O | U+0254: LATIN SMALL LETTER OPEN O |
    | U+0189: LATIN CAPITAL LETTER AFRICAN D | U+0256: LATIN SMALL LETTER D WITH TAIL |
    | U+018A: LATIN CAPITAL LETTER D WITH HOOK | U+0257: LATIN SMALL LETTER D WITH HOOK |
    | U+018F: LATIN CAPITAL LETTER SCHWA | U+0259: LATIN SMALL LETTER SCHWA |
    | U+0190: LATIN CAPITAL LETTER OPEN E | U+025B: LATIN SMALL LETTER OPEN E |
    | U+0193: LATIN CAPITAL LETTER G WITH HOOK | U+0260: LATIN SMALL LETTER G WITH HOOK |
    | U+0194: LATIN CAPITAL LETTER GAMMA | U+0263: LATIN SMALL LETTER GAMMA |
    | U+0196: LATIN CAPITAL LETTER IOTA | U+0269: LATIN SMALL LETTER IOTA |
    | U+019C: LATIN CAPITAL LETTER TURNED M | U+026F: LATIN SMALL LETTER TURNED M |
    | U+019D: LATIN CAPITAL LETTER N WITH LEFT HOOK | U+0272: LATIN SMALL LETTER N WITH LEFT HOOK |
    | U+019F: LATIN CAPITAL LETTER O WITH MIDDLE TILDE | U+0275: LATIN SMALL LETTER BARRED O |
    | U+01A6: LATIN LETTER YR | U+0280: LATIN LETTER SMALL CAPITAL R |
    | U+01A9: LATIN CAPITAL LETTER ESH | U+0283: LATIN SMALL LETTER ESH |
    | U+01AE: LATIN CAPITAL LETTER T WITH RETROFLEX HOOK | U+0288: LATIN SMALL LETTER T WITH RETROFLEX HOOK |
    | U+01B1: LATIN CAPITAL LETTER UPSILON | U+028A: LATIN SMALL LETTER UPSILON |
    | U+01B2: LATIN CAPITAL LETTER V WITH HOOK | U+028B: LATIN SMALL LETTER V WITH HOOK |
    | U+023A: LATIN CAPITAL LETTER A WITH STROKE | U+2C65: LATIN SMALL LETTER A WITH STROKE |
    | U+023E: LATIN CAPITAL LETTER T WITH DIAGONAL STROKE | U+2C66: LATIN SMALL LETTER T WITH DIAGONAL STROKE |
    | U+0244: LATIN CAPITAL LETTER U BAR | U+0289: LATIN SMALL LETTER U BAR |
    | U+0245: LATIN CAPITAL LETTER TURNED V | U+028C: LATIN SMALL LETTER TURNED V |
    | U+1E2D: LATIN SMALL LETTER I WITH TILDE BELOW | U+1E2C: LATIN CAPITAL LETTER I WITH TILDE BELOW |
    | U+1ECB: LATIN SMALL LETTER I WITH DOT BELOW | U+1ECA: LATIN CAPITAL LETTER I WITH DOT BELOW |

    [code: missing-case-counterparts]
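
The idea behind a report like the one above can be sketched in a few lines. This is an approximation for illustration only, not Fontbakery's implementation. The names `missing_case_counterparts` and `describe` are hypothetical, and the input is assumed to be the set of code points encoded in the font (for example, the keys of the dict returned by fontTools' `getBestCmap()`).

```python
import unicodedata


def missing_case_counterparts(codepoints):
    """Return (present_cp, missing_cp) pairs where the case-swapped form of
    an encoded character is not itself encoded."""
    present = set(codepoints)
    missing = []
    for cp in sorted(present):
        ch = chr(cp)
        swapped = ch.swapcase()
        # Ignore caseless characters and multi-character case mappings.
        if len(swapped) != 1 or swapped == ch:
            continue
        if ord(swapped) not in present:
            missing.append((cp, ord(swapped)))
    return missing


def describe(cp):
    """Format a code point in the same style as the report above."""
    return f"U+{cp:04X}: {unicodedata.name(chr(cp), '<unnamed>')}"


# Toy data: Ɓ (U+0181) is encoded without ɓ (U+0253); A/a are both present.
gaps = missing_case_counterparts({0x0041, 0x0061, 0x0181})
report = [(describe(a), describe(b)) for a, b in gaps]
```

Because the check swaps case in both directions, it also catches entries like U+1E2D in the table, where the lowercase character is encoded but the capital is not.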
moyogo commented 4 months ago

@djrrb The Latin Extended B block was included in the Ubuntu fonts, whether it made sense or not to cover only uppercase characters of many pairs. The assumption was that Unicode blocks were neat complete sets, when they often aren’t. :shrug:

djrrb commented 4 months ago

Makes sense...thanks @moyogo!