notofonts / symbols

Noto Symbols
SIL Open Font License 1.1

missing symbols from code page 437 : box drawing and filled boxes characters in Unicode (U+2500..U+25A0) #19

Closed verdy-p closed 1 year ago

verdy-p commented 5 years ago

Is there a way to add the missing symbols used in DOS code page 437 (filled boxes and box drawing characters)?

These are the characters in code range 0xB0 to 0xDF in that codepage (with their description in French), and they are all in the same Unicode range (U+2500 to U+257F: box drawing characters, and U+2580 to U+259F: filled boxes); some of them are mapped like this in CP437:

There's also one character in CP437 that maps to a symbol in the next Unicode block, Geometric Shapes:
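For reference, the mapping described above can be reproduced with Python's built-in `cp437` codec. This is only a minimal sketch to list the code points involved; the variable names are illustrative and not part of the original report.

```python
# Decode the CP437 byte range 0xB0..0xDF with Python's built-in codec.
cp437_box_bytes = bytes(range(0xB0, 0xE0))
decoded = cp437_box_bytes.decode("cp437")

# Print one "byte -> code point" line per character.
for byte, char in zip(cp437_box_bytes, decoded):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {char}")

# 0xFE is the one byte that lands in the following block (Geometric Shapes),
# and 0xFF is the non-breaking space.
black_square = bytes([0xFE]).decode("cp437")
nbsp = bytes([0xFF]).decode("cp437")
print(f"0xFE -> U+{ord(black_square):04X}, 0xFF -> U+{ord(nbsp):04X}")
```

All 48 bytes in 0xB0..0xDF decode to code points inside U+2500..U+259F, which is why a single font covering that range would be enough for this part of the code page.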

For now they cannot be displayed at all with any Noto font. Only a few symbols for the Unicode block starting at U+25A0 are mapped in "Noto Sans Math", some others are defined in "Noto Sans Emoji", but there's no consistency for this block of geometric shapes.

When rendering legacy texts using CP437 codepages (or other DOS codepages) we frequently get tofu, and when using Noto fonts with monospaced styles (including CJK fonts), they are not properly aligned.

With non-monospaced fonts, all these characters should have the same metric as the non-breaking space (also mapped in CP437 at 0xFF), or as the ideographic space (in CJK fonts).

I think they should be defined in "Noto Sans Symbols" or "Noto Sans Symbols2". This does not require complex glyphs.

dscorbett commented 5 years ago

Try Noto Sans Mono.

verdy-p commented 5 years ago

Noto Sans Mono does not work well in stylesheets, because it overrides too many characters, notably most alphabetic ones. Maybe we can then use Mono only as a last alternative in CSS font lists. But in most stylesheets the mono font is simply banned.

verdy-p commented 5 years ago

Note that U+2591 (25% grey pattern filled box) and U+2593 (75% grey pattern filled box) are in Mono, but U+2592 (50% grey pattern filled) is not found.

Correction: it is found, but when testing with Mono in last position, it is overridden by other "suitable" fonts that define their own glyph for U+2592 only.
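The override effect can be illustrated with a simplified model of per-character font fallback: for each character, the first font in the list that covers it wins. This is a sketch only. Real browsers apply more rules than this, and the font name "Some Other Font" and all coverage sets below are invented for illustration.

```python
def pick_font(char, font_stack, coverage):
    """Return the first font in the stack whose coverage includes char."""
    for font in font_stack:
        if ord(char) in coverage.get(font, set()):
            return font
    return None  # no font covers it: the character renders as tofu


# Hypothetical coverage data, for illustration only.
coverage = {
    "Noto Sans": set(range(0x20, 0x7F)),
    "Some Other Font": {0x2592},
    "Noto Sans Mono": set(range(0x20, 0x7F)) | set(range(0x2500, 0x25A0)),
}

# Mono is listed last so that it does not override the Latin letters.
stack = ["Noto Sans", "Some Other Font", "Noto Sans Mono"]

chosen = {ch: pick_font(ch, stack, coverage) for ch in "A\u2591\u2592\u2593"}
for ch, font in chosen.items():
    print(f"U+{ord(ch):04X} -> {font}")
```

In this model U+2591 and U+2593 come from Noto Sans Mono while U+2592 comes from the earlier font, so the three shade characters end up with mismatched glyphs.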

Noto Sans Mono does not play well with stylesheets that are designed to separate the three generic CSS/HTML font families, notably "serif" and "sans-serif". Noto Sans Mono contradicts the design goals of the Noto family by using a multiscript approach. It only plays well with font lists designed exclusively for the standard "monospace" generic family.

This means that there's still no support for these characters in the sans-serif and serif families (note that I do not request that all fonts exist in both sans-serif and serif styles; it is fine if only one of the two is defined, which makes sense for many scripts other than Latin/Greek/Cyrillic and similar, or CJK).

If I compare with Windows fonts, the box drawing and geometric shapes characters are in Cambria Math (which is not a monospaced font), but not in Noto Sans Math or Noto Sans Symbols(2), and they work well with Segoe UI as well (making the Windows console usable for "CMD" or for shells in Linux for Windows).

verdy-p commented 5 years ago

So what I suggest is not to import ALL monospaced glyphs from "Noto Sans Mono", but to reimport the few symbolic characters existing only in "Noto Sans Mono" into "Noto Sans Symbols" or "Noto Sans Symbols2" to get a consistent view, without overriding any Latin/Greek/Cyrillic characters (or other scripts) from "Noto Sans *" fonts with the monospaced glyphs from "Noto Sans Mono" (whose coverage is too large, so it cannot be used consistently with other "Noto Sans" fonts to get complete Unicode coverage). For now this is just 161 characters (U+2500..U+25A0, but a few more may be needed for legacy DOS/OEM pages that I did not check).
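The count of 161 can be checked with a few lines of Python. This sketch splits the requested range by Unicode block using the block boundaries given earlier in this thread; the variable names are illustrative.

```python
import unicodedata

# The requested range, inclusive of U+25A0.
requested = range(0x2500, 0x25A0 + 1)

box_drawing = [cp for cp in requested if 0x2500 <= cp <= 0x257F]
block_elements = [cp for cp in requested if 0x2580 <= cp <= 0x259F]
geometric = [cp for cp in requested if cp >= 0x25A0]

print(len(requested), len(box_drawing), len(block_elements), len(geometric))
# prints: 161 128 32 1

# Character names from the Unicode database bundled with Python.
names = {cp: unicodedata.name(chr(cp), "<unassigned>") for cp in requested}
print(names[0x2592], "/", names[0x25A0])
```

So the request covers the full Box Drawing block (128 characters), the full Block Elements block (32 characters), and the single character U+25A0.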

The same glyphs may also be imported into CJK fonts (possibly adapted to their ideographic metrics), unless this is not possible due to their specific Adobe licencing. These fonts are already very large even though they cover only CJK characters in the BMP, so the additional non-Han characters could probably fit elsewhere: notably monospaced variants of Latin/Greek/Cyrillic as well as Japanese syllabaries, Bopomofo syllabary, Hangul alphabet and composed syllables, Yi syllabary, Old Mongolian abugida, and Tangut, keeping only Hanzi/Hanja/Kanji sinograms in CJK fonts. Maybe the CJK radicals and CJK symbols or punctuation could also be removed from these national CJK font variants and placed in another relevant common font for Japan, Korea and China, without the Adobe licencing issues for Hanzi/Hanja/Kanji sinograms.

These Adobe-licenced CJK fonts would keep their specific naming, but open Noto fonts for parts not needing these legacy Adobe fallbacks should use a consistent Noto naming like "Noto Sans [Jpan/Kore/Hans/Hant] [Supplementary]", while the existing "Noto Sans CJK [KR/JP/SC/TC]" would be used only as fallbacks (only needed for full Hanja/Kanji/Hanzi sinograms in the BMP) after the new fonts.

The design of "Noto Sans Mono" follows another goal. In fact, its naming as part of the "Noto Sans" family is IMHO a design error: it should just be "Noto Mono", or "Noto Mono Sans" if you also intend to provide a monospaced font with a serif style variant, or "Noto DoubleWidth" for CJK usage (with double-width characters using two "display cells").

You may want to preserve the compatibility of font names by adding font name aliases in these fonts.

dougfelt commented 5 years ago

It seems reasonable to me to ensure that all glyphs in Mono are covered by a non-monospace Noto font. Symbols2 seems a reasonable choice. I'm not directly involved anymore, so someone else will have to make that decision. @marekjez86 can you bring this up?

nizarsq commented 4 years ago

Tested the following set; currently only 25A0 exists in NotoSansSymbols/NotoSansSymbols2.

2591 | LIGHT SHADE | ░
2592 | MEDIUM SHADE | ▒
2593 | DARK SHADE | ▓
2502 | BOX DRAWINGS LIGHT VERTICAL | │
2524 | BOX DRAWINGS LIGHT VERTICAL AND LEFT | ┤
2561 | BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE | ╡
2562 | BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE | ╢
2556 | BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE | ╖
2555 | BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE | ╕
2563 | BOX DRAWINGS DOUBLE VERTICAL AND LEFT | ╣
2551 | BOX DRAWINGS DOUBLE VERTICAL | ║
2557 | BOX DRAWINGS DOUBLE DOWN AND LEFT | ╗
255D | BOX DRAWINGS DOUBLE UP AND LEFT | ╝
255C | BOX DRAWINGS UP DOUBLE AND LEFT SINGLE | ╜
255B | BOX DRAWINGS UP SINGLE AND LEFT DOUBLE | ╛
2510 | BOX DRAWINGS LIGHT DOWN AND LEFT | ┐
2514 | BOX DRAWINGS LIGHT UP AND RIGHT | └
2534 | BOX DRAWINGS LIGHT UP AND HORIZONTAL | ┴
252C | BOX DRAWINGS LIGHT DOWN AND HORIZONTAL | ┬
251C | BOX DRAWINGS LIGHT VERTICAL AND RIGHT | ├
2500 | BOX DRAWINGS LIGHT HORIZONTAL | ─
253C | BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL | ┼
255E | BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE | ╞
255F | BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE | ╟
255A | BOX DRAWINGS DOUBLE UP AND RIGHT | ╚
2554 | BOX DRAWINGS DOUBLE DOWN AND RIGHT | ╔
2569 | BOX DRAWINGS DOUBLE UP AND HORIZONTAL | ╩
2566 | BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL | ╦
2560 | BOX DRAWINGS DOUBLE VERTICAL AND RIGHT | ╠
2550 | BOX DRAWINGS DOUBLE HORIZONTAL | ═
256C | BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL | ╬
2567 | BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE | ╧
2568 | BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE | ╨
2564 | BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE | ╤
2565 | BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE | ╥
2559 | BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE | ╙
2558 | BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE | ╘
2552 | BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE | ╒
2553 | BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE | ╓
256B | BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE | ╫
256A | BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE | ╪
2518 | BOX DRAWINGS LIGHT UP AND LEFT | ┘
250C | BOX DRAWINGS LIGHT DOWN AND RIGHT | ┌
2588 | FULL BLOCK | █
2584 | LOWER HALF BLOCK | ▄
258C | LEFT HALF BLOCK | ▌
2590 | RIGHT HALF BLOCK | ▐
2580 | UPPER HALF BLOCK | ▀
25A0 | BLACK SQUARE | ■
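A coverage test like this one could be scripted roughly as follows. This is a hedged sketch, not the procedure actually used in this comment: `missing_codepoints` and `example_cmap` are hypothetical names, and the stand-in cmap simply mirrors the reported result (only U+25A0 present). With fontTools installed, a real code-point-to-glyph mapping could be obtained from `TTFont(path).getBestCmap()`.

```python
def missing_codepoints(cmap, wanted):
    """Return the wanted code points that have no entry in the font's cmap."""
    return sorted(cp for cp in wanted if cp not in cmap)


# The CP437 bytes 0xB0..0xDF plus 0xFE, decoded with Python's built-in codec.
cp437_bytes = bytes(range(0xB0, 0xE0)) + b"\xfe"
wanted = {ord(ch) for ch in cp437_bytes.decode("cp437")}

# Stand-in cmap for illustration only: a dict of code point -> glyph name
# containing just U+25A0, mirroring the result reported in this comment.
example_cmap = {0x25A0: "uni25A0"}

missing = missing_codepoints(example_cmap, wanted)
print(f"{len(missing)} of {len(wanted)} code points missing")
print(" ".join(f"{cp:04X}" for cp in missing))
```

Keeping the check as a plain function over a dict means the same code works with a cmap from any font-inspection tool.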

marekjez86 commented 4 years ago

These are now part of

shouldn't be part of NotoSansSymbols* (including 25A0), but need to take into consideration Philippe's arguments...

simoncozens commented 1 year ago

Closing in favour of consolidated report #69.