notofonts / hebrew

Noto Hebrew
SIL Open Font License 1.1

Where are Common script characters for Noto generally and for Hebrew in particular? #44

Closed markhdavid closed 4 months ago

markhdavid commented 4 months ago

I see that in the Google Fonts public release, all three Noto Hebrew families (Sans, Rashi, Serif) have many Common script characters, such as U+201C [“] LEFT_DOUBLE_QUOTATION_MARK. (One Common character is missing, namely U+201F [‟] DOUBLE_HIGH-REVERSED-9_QUOTATION_MARK, which I wanted to report as an issue.) However, in the repo, such Common script characters are completely missing. The Noto fonts README (https://github.com/notofonts/notofonts.github.io/blob/main/README.md) says Noto fonts are organized by script, but I see no entry for script "Common" in the repo. Where do I find that script, or where do I find these characters? And how do I report missing characters or other defects in such characters?

markhdavid commented 4 months ago

My aim is basically to report a missing double-quote character, namely U+201F [‟] DOUBLE_HIGH-REVERSED-9_QUOTATION_MARK, which is missing in all three publicly released Noto Hebrew fonts. My issue is similar to the one I filed for another open source font, Frank Ruhl Libre: "Missing glyph for DOUBLE_HIGH-REVERSED-9_QUOTATION_MARK" (#25). If someone, perhaps @simoncozens, could guide me on that, I would greatly appreciate it.

simoncozens commented 4 months ago

Hi Mark, so to answer the question we have to think about what Noto is - it is a family of fonts, intended for use in a fallback stack, which covers all of Unicode. If your font fallback stack is set up properly, then if a character is in any Noto font, it's "in Noto" and will be displayed correctly. The NotoVerse tells you which fonts cover each codepoint; for 201F, it's covered in Noto Sans/Serif (Latin/Greek/Cyrillic). So if your font stack is "Noto Sans Hebrew, Noto Sans" then everything will work.

We try to avoid duplicating glyphs into other fonts if they are already covered, unless there is some specific interaction which we want to achieve. In this case, I don't think we need to do anything clever between U+201F and any Hebrew glyph, so it would not make sense to copy U+201F into the Hebrew fonts.
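
The fallback behaviour described in this comment can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular text renderer implements fallback. `resolve_font` and `coverage_from_file` are hypothetical helpers, and the coverage sets are toy data: only the facts that U+201F is covered by Noto Sans and absent from Noto Sans Hebrew come from this thread.

```python
from typing import Optional, Sequence, Set, Tuple

# A font in the stack: (family name, set of covered codepoints).
FontCoverage = Tuple[str, Set[int]]


def resolve_font(codepoint: int, stack: Sequence[FontCoverage]) -> Optional[str]:
    """Return the first family in the stack that covers the codepoint."""
    for family, coverage in stack:
        if codepoint in coverage:
            return family
    return None  # no font covers it: the renderer would show "tofu"


def coverage_from_file(path: str) -> Set[int]:
    """Read real coverage from a font file (assumes fontTools is installed)."""
    from fontTools.ttLib import TTFont  # imported lazily; optional dependency

    return set(TTFont(path).getBestCmap())


# Toy coverage sets, invented for illustration only.
hebrew = ("Noto Sans Hebrew", {0x05D0, 0x05D1, 0x05DC, 0x05B7, 0x201C, 0x201D, 0x201E})
latin = ("Noto Sans", {0x201C, 0x201D, 0x201E, 0x201F})
stack = [hebrew, latin]  # mirrors the stack "Noto Sans Hebrew, Noto Sans"

for cp in (0x05D1, 0x201E, 0x201F):
    print(f"U+{cp:04X} -> {resolve_font(cp, stack)}")
```

With a stack like this, U+201F falls through to Noto Sans; with the Hebrew font alone, it resolves to nothing.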

markhdavid commented 4 months ago

If your font fallback stack is set up properly, then if a character is in any Noto font, it's "in Noto" and will be displayed correctly. The NotoVerse tells you which fonts cover each codepoint; for 201F, it's covered in Noto Sans/Serif (Latin/Greek/Cyrillic).

Hi, thanks @simoncozens for that explanation, but how do I (or any users) tell what the correct font stack is for each of the 3 Noto Hebrew fonts, Sans, Serif, and Rashi? Is this specified somewhere, formally or otherwise? Maybe Sans seems obvious, but there are two Serifs: "Serif" and "Serif Display". What about for Rashi? Thank you.

markhdavid commented 4 months ago

Update: I initially tested with TTF fonts generated directly from the Glyphs app. Today I tested with TTF fonts generated via the official make build command, which results in many more characters being present. The upshot is that in the current fonts, just as in the fonts publicly released on Google Fonts, one particular double-quote character (U+201F [‟] DOUBLE_HIGH-REVERSED-9_QUOTATION_MARK) is missing. I want to flag this as an issue. It's producing "tofu" for one particular double-quote character, and for this font it is inconsistent and arbitrary, and it very adversely affects one language (Yiddish) for publishing at least one major online newspaper (forverts.com/yiddish/). Here is some sample text, the codepoints in it, and how the text looks when rendered by hb-view with NotoSansHebrew-Regular.ttf as built on current HEAD:

U+201D [”] RIGHT_DOUBLE_QUOTATION_MARK
U+201E [„] DOUBLE_LOW-9_QUOTATION_MARK
U+05D1 [ב] HEBREW_LETTER_BET
U+05DC [ל] HEBREW_LETTER_LAMED
U+05D0 [א] HEBREW_LETTER_ALEF
U+05B7 [ַ] HEBREW_POINT_PATAH
U+201F [‟] DOUBLE_HIGH-REVERSED-9_QUOTATION_MARK
U+201C [“] LEFT_DOUBLE_QUOTATION_MARK

Screenshot: 2024-03-08-NotoSerif-Regular-dqs-BUILT

Test text file: test-double-quotes.txt

hb-view command used to generate the screenshot (run in fonts/NotoSansHebrew/full/ttf/ after make build):

hb-view --font-file NotoSansHebrew-Regular.ttf --text-file "test-double-quotes.txt" --output-file "2024-03-08-NotoSerif-Regular-dqs-BUILT.png"
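
A codepoint listing in the format used in this comment can be produced with Python's standard `unicodedata` module. The following is a small sketch; `describe` is a hypothetical helper, and the sample string is retyped from the listing, not read from test-double-quotes.txt.

```python
import unicodedata
from typing import List


def describe(text: str) -> List[str]:
    """Format each character as 'U+XXXX [char] NAME_WITH_UNDERSCORES'."""
    lines = []
    for ch in text:
        # Fall back to a placeholder for characters without a Unicode name.
        name = unicodedata.name(ch, "UNKNOWN").replace(" ", "_")
        lines.append(f"U+{ord(ch):04X} [{ch}] {name}")
    return lines


# The eight codepoints from the listing, in the same order.
sample = "\u201d\u201e\u05d1\u05dc\u05d0\u05b7\u201f\u201c"

for line in describe(sample):
    print(line)
```

Listing the codepoints this way makes it easy to cross-check a tofu glyph in a rendering against the exact character that produced it.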

simoncozens commented 4 months ago

I want to flag this as an issue.

I'm not sure it is an issue. For Google Fonts we have a certain expected set of "Latin core" characters that must be included in any font. That's why the googlefonts/ build includes more characters. But U+201F is not in Latin Core even though U+201E is. (Again, in a fallback stack you will be using Noto Hebrew alongside Noto Serif and Sans, so this character will still work fine.)

I don't know the rationale behind including U+201E but not U+201F, but I can see the definition here. Maybe it's just that the "reversed" forms are much more rarely used. If you want to make the case that both should be included (or ask why), then the right place is the glyphset repository.

It's producing "tofu" for one particular double-quote character, and for this font it is inconsistent and arbitrary, and it very adversely affects one language (Yiddish) for publishing at least one major online newspaper (forverts.com/yiddish/).

I'm sorry to put it this way, but they ain't got their CSS set up correctly.

how do I (or any users) tell what the correct font stack is for each of the 3 Noto Hebrew fonts, Sans, Serif, and Rashi? Is this specified somewhere, formally or otherwise? Maybe Sans seems obvious, but there are two Serifs: "Serif" and "Serif Display". What about for Rashi?

Pairing fonts is a matter of taste, but we do give hints - sources/config-rashi-hebrew.yaml says:

buildVariable: true
familyName: Noto Rashi Hebrew
category:
- SERIF
googleFonts: true
includeSubsets:
- from: Noto Serif
  name: GF_Latin_Core
sources:
- NotoRashiHebrew.glyphs

i.e. "this is a serif style, and when we build the full and googlefonts builds, we include the characters from Noto Serif."
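
At the level of codepoint sets, the effect of an `includeSubsets` entry like the one above can be modelled roughly as follows. This is a simplified sketch and not the actual Noto build code: `build_coverage` is a hypothetical function, and the glyphset and coverage sets are toy stand-ins, not the real GF_Latin_Core definition. The only facts taken from this thread are that U+201E is in Latin Core, U+201F is not, and Noto Serif covers U+201F.

```python
from typing import Set


def build_coverage(own: Set[int], donor: Set[int], glyphset: Set[int]) -> Set[int]:
    """Model a build as: the font's own codepoints, plus those donor
    codepoints that are listed in the named glyphset."""
    return own | (donor & glyphset)


# Toy data for illustration only.
rashi_own = {0x05D0, 0x05D1, 0x05DC, 0x05B7}            # Hebrew letters and a point
noto_serif = {0x0041, 0x201C, 0x201D, 0x201E, 0x201F}   # donor font coverage
latin_core_stand_in = {0x0041, 0x201C, 0x201D, 0x201E}  # NOT the real GF_Latin_Core

built = build_coverage(rashi_own, noto_serif, latin_core_stand_in)

for cp in (0x201E, 0x201F):
    status = "included" if cp in built else "missing"
    print(f"U+{cp:04X}: {status}")
```

Under this model, U+201F stays out of the built font even though the donor covers it, because the glyphset acts as a filter on what is copied.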