Eelis / cxxdraft-htmlgen

Generates https://eel.is/c++draft

Courier New does not render U+0301 correctly #103

Closed timsong-cpp closed 1 year ago

timsong-cpp commented 1 year ago

This affects s9 in https://eel.is/c++draft/format.string.escaped#example-1: image

The accent should be on the e, not the quote, like:

image

Eelis commented 1 year ago

What browser are you using? It looks ok on my machine with both Chromium (111.0.5560.0) and Firefox (116.0.3): image

timsong-cpp commented 1 year ago

Chrome 116.0.5845.111 on Windows 10. The whole line looks like

image

This is clearly using a different font - your screenshot has a dot inside 0 and no serifs in u. Looking at the sample on Wikipedia I don't think yours is actually using Courier New?

Eelis commented 1 year ago

Ah, I indeed didn't have Courier New installed, and so it used Liberation Mono instead.

I've now installed the MS Fonts, but the result still looks ok: image

timsong-cpp commented 1 year ago

Different font version perhaps? This is what I see in Windows font settings with the text copied in, so it definitely looks font-related:

image

Eelis commented 1 year ago

font-manager shows:

image

Eelis commented 1 year ago

Perhaps it would be useful to file a support ticket with Microsoft, to ask if there's anything they can do to make Courier New work well on their platform?

timsong-cpp commented 1 year ago

I see the same issue in both Chrome and Safari on my iPhone too, so it’s not just MS.

Eelis commented 1 year ago

I see, interesting. I just tried with Firefox and Chrome on my Android phone, and both rendered the character correctly.

Eelis commented 1 year ago

Purely speculative hypothesis: maybe the combining characters for accents are not in Courier New proper, but Linux cleverly uses the missing characters from another font, while Windows/Mac lack this sophistication?

timsong-cpp commented 1 year ago

It's definitely the font: copying the 6.92 font to WSL breaks rendering in Chrome-in-WSL, while the 2.82 font renders as desired in https://fontdrop.info/ in Chrome-on-Windows. Perhaps the 2.82 version is so old that it led to some sort of default or compatibility handling that ends up being better for this particular case?

Also, both versions are not very monospace-y for Cyrillic (see the s2 example). It might be worthwhile to consider using a different font if the only version that semi-works is an ancient version not installed anywhere by default.

Eelis commented 1 year ago

Ah, thanks for the investigation. Good idea, I've switched to Roboto Mono now.

jengelh commented 6 months ago

Between the last comment and now, Roboto was switched out for Noto. Along that timeline, the problem with a misplaced dot reappeared. Today, it looks like so in Firefox/Linux:

image

Remember that 'ẹ́' consists of U+0065, U+0301, U+0323. In css2_002.css, we find that Noto is defined piecewise from sub-files:

/* cyrillic */
@font-face {
  font-family: 'Noto Sans Mono';
 ...
  unicode-range: U+0301, U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116;
}
/* vietnamese */
@font-face {
  font-family: 'Noto Sans Mono';
 ...
  unicode-range: U+0102-0103, U+0110-0111, U+0128-0129, U+0168-0169, U+01A0-01A1, U+01AF-01B0, U+0300-0301, U+0303-0304, U+0308-0309, U+0323, U+0329, U+1EA0-1E
}

[Edit: U+0301 is provided by two files. That alone seems problematic, as you can no longer tell which one wins.]
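To check the overlap mechanically rather than by eye, here is a minimal Python sketch that parses the two `unicode-range` values quoted above and reports code points claimed by both blocks. The function name `parse_unicode_range` is made up for illustration, the parser does not handle wildcard ranges such as `U+4??`, and the vietnamese value is cut off where the excerpt above is truncated, so this only covers the part that is visible here.

```python
def parse_unicode_range(value):
    """Parse a CSS unicode-range value into a set of code points.

    Handles single code points (U+0301) and intervals (U+0400-045F).
    Wildcard ranges (U+4??) are not supported in this sketch.
    """
    points = set()
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        body = token[2:]  # drop the "U+" prefix
        lo, _, hi = body.partition("-")
        points.update(range(int(lo, 16), int(hi or lo, 16) + 1))
    return points


# Values copied from the excerpt above.
CYRILLIC = "U+0301, U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116"
# The vietnamese value is truncated in the excerpt; only the complete
# entries are listed here, so later overlaps (if any) are not detected.
VIETNAMESE = (
    "U+0102-0103, U+0110-0111, U+0128-0129, U+0168-0169, U+01A0-01A1, "
    "U+01AF-01B0, U+0300-0301, U+0303-0304, U+0308-0309, U+0323, U+0329"
)

overlap = sorted(parse_unicode_range(CYRILLIC) & parse_unicode_range(VIETNAMESE))
print([f"U+{cp:04X}" for cp in overlap])
```

Running the same check over every `@font-face` block in the stylesheet would show whether U+0301 is the only code point claimed by more than one block.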

If I comment out the cyrillic block, the dot moves. My hypothesis is that the stitching done with unicode-range may combine a "latin"-sourced U+0065 with a "cyrillic"-sourced U+0301, causing the page renderer to ignore all the kerning pairs involving U+0301 in both the "latin" and "cyrillic" sources (because they are two separate fonts at this point!), thus misplacing the dot.
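To make the hypothesis concrete, the following Python sketch lists, for each code point of the s9 string, which font fragments declare it in their `unicode-range`. It deliberately does not decide which fragment the browser ends up using. The `latin` range here is an assumption (it is not quoted in this thread), the vietnamese range is only an excerpt of the truncated value above, and `covers` and `candidates` are made-up helper names.

```python
import unicodedata

SUBSETS = {
    # ASSUMPTION: the latin fragment's range is not quoted in this thread;
    # U+0000-00FF is a placeholder that at least covers U+0065.
    "latin": "U+0000-00FF",
    # Copied from the cyrillic block quoted above.
    "cyrillic": "U+0301, U+0400-045F, U+0490-0491, U+04B0-04B1, U+2116",
    # Excerpt of the (truncated) vietnamese block quoted above.
    "vietnamese": "U+0300-0301, U+0303-0304, U+0308-0309, U+0323, U+0329",
}


def covers(range_value, cp):
    """Return True if the unicode-range value includes code point cp."""
    for token in range_value.split(","):
        lo, _, hi = token.strip()[2:].partition("-")
        if int(lo, 16) <= cp <= int(hi or lo, 16):
            return True
    return False


def candidates(text):
    """For each code point, list the subsets whose range declares it."""
    rows = []
    for ch in text:
        cp = ord(ch)
        names = [name for name, rng in SUBSETS.items() if covers(rng, cp)]
        rows.append((f"U+{cp:04X}", unicodedata.name(ch), names))
    return rows


rows = candidates("e\u0301\u0323")
for row in rows:
    print(row)
```

Under these assumptions, no single fragment covers all three code points of the cluster, which is the situation the hypothesis depends on.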

The remedy, I think, is to stop using woff fragments and instead download a single (large) font file that covers all scripts at once. This would be a somewhat significant change: https://eel.is/c++draft/format.string.escaped#example-1 currently downloads 34KB of font fragments, but NotoMono as a single file would weigh in at ~207KB (when woff-ed).

(A PR is not deemed necessary currently, since we're evaluating other font choices anyway.)