metanorma / bipm-si-brochure

SI Brochure edition 9, semantic encoded version (WARNING: DRAFT)

BIPM review: sans-serif fonts shown as serif fonts for dimension symbols #121

Closed manuelfuenmayor closed 3 years ago

manuelfuenmayor commented 3 years ago

In relation to https://github.com/metanorma/bipm-si-brochure/issues/112

Original:

Screenshot 2021-06-17 at 9 44 25 PM

Rendered:

issue3

stem:[sf "I"] and stem:[sf "J"] are showing up with serif.

ronaldtse commented 3 years ago

I will seek clarification from BIPM. (UPDATE: Let's try a better solution as stated below)

In any case, we do not want a plain stem:[sf "I"] here; the symbol should be expressed with UnitsML encoding.

ronaldtse commented 3 years ago

Additional information.

@Intelligent2013 says we are using STIX Two Math for these dimension symbols, with the character map here: https://github.com/stipub/stixfonts/blob/master/docs/STIXTwoMath-Regular.pdf

I noticed that this code page contains "I" and "J" without ligatures.

Screenshot 2021-06-17 at 9 12 27 PM

@Intelligent2013 says:

In the source adoc: | ((electric current)) | stem:[ii(I), i] | stem:[sf "I"]
Here the I has code 49 (hexadecimal, U+0049). But it looks like we need to use 1D5A8. I'll check it now.

The second line was encoded as: | ((electric current)) | stem:[ii(I), i] | stem:[sf "𝖨"]
Here the I is 1D5A8, but both display identically. I have to dig deeper into the issue; it looks like a bug in jEuclid or something else.

image

Hopefully we can fix this. I also wonder if we can encode such a symbol for UnitsML? Ping @opoudjis .

ronaldtse commented 3 years ago

Anyway, the point is that the Unicode code points 1D5A8 and 1D5A9 are correct. The display aberration is a font-specific issue.

https://www.compart.com/en/unicode/block/U+1D400

Screenshot 2021-06-17 at 9 38 25 PM

Intelligent2013 commented 3 years ago

Some investigation into how jEuclid works. In jEuclid there is a list of predefined fonts for each of 'sans-serif', 'serif', 'monospaced', 'script', 'fraktur' and 'doublestruck'. Example for sans-serif:

        final List<String> fontsSanserif = new ArrayList<String>(12);
        fontsSanserif.add("Verdana");
        fontsSanserif.add("Helvetica");
        fontsSanserif.add("Arial");
        fontsSanserif.add("Arial Unicode MS");
        fontsSanserif.add("Lucida Sans Unicode");
        fontsSanserif.add("Lucida Sans");
        fontsSanserif.add("Lucida Grande");
        fontsSanserif.add("DejaVu Sans");
        fontsSanserif.add("DejaVuSans");
        fontsSanserif.add("Bitstream Vera Sans");
        fontsSanserif.add("Luxi Sans");
        fontsSanserif.add("FreeSans");
        fontsSanserif.add("sansserif");

When we specify mathvariant="sans-serif":

<mstyle mathvariant="sans-serif">
  <mtext>I</mtext>
</mstyle>

this means that the first font found in the list will be used to display the text inside <mstyle mathvariant="sans-serif">, not the main font 'STIX Two Math'. We modified a similar list for "script" here: https://github.com/metanorma/mn-native-pdf/issues/289. Maybe we need to move the 'Arial' font to first place...

But I can't figure out yet why char '1D5A8' displays identically to 49. Will investigate...
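The "first found font wins" behaviour described above can be sketched as follows. This is only an illustration: the class FontFallbackSketch and the method firstAvailable are hypothetical names, not part of jEuclid, and the set of installed fonts is made up for the example.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper (not jEuclid code): returns the first family from an
// ordered preference list that is present in the set of installed families.
final class FontFallbackSketch {
    static Optional<String> firstAvailable(List<String> preferred, Set<String> installed) {
        for (String family : preferred) {
            if (installed.contains(family)) {
                return Optional.of(family);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened copy of the sans-serif list quoted above.
        List<String> sansSerif = List.of("Verdana", "Helvetica", "Arial", "DejaVu Sans");
        // Assumed set of installed fonts, for illustration only.
        Set<String> installed = Set.of("STIX Two Math", "Arial", "DejaVu Sans");
        // STIX Two Math is installed but is not in the preference list,
        // so it is never considered; Arial is the first match.
        System.out.println(firstAvailable(sansSerif, installed).orElse("none")); // prints "Arial"
    }
}
```

Under this model, a font that is absent from the preference list is never selected, which would be consistent with STIX Two Math being bypassed for sans-serif text.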

ronaldtse commented 3 years ago

@Intelligent2013 we need to ensure STIX Two Math is really used as the first font for jEuclid...

ronaldtse commented 3 years ago

In any case, I have sought BIPM clarification on what code points they prefer for the dimension symbols.

Intelligent2013 commented 3 years ago

From jEuclid sources (jeuclid\jeuclid-core\src\main\resources\net\sourceforge\jeuclid\UnicodeData.txt):

1D5A8;MATHEMATICAL SANS-SERIF CAPITAL I;Lu;0;L;<font> 0049;;;;N;;;;;

jeuclid\jeuclid-core\src\main\java\net\sourceforge\jeuclid\elements\support\text\CharacterMapping.java:

...
private static final int HIGHPLANE_MATH_CHARS_START = 0x1D400;
...
final boolean force = (codepoint >= CharacterMapping.HIGHPLANE_MATH_CHARS_START)
        && ((FontFamily.SANSSERIF.equals(fam)) || (FontFamily.SERIF
                .equals(fam)));
if (force) {
    this.forceSet.add(codepoint);
}

final CodePointAndVariant cpav = new CodePointAndVariant(mapsTo,
        new MathVariant(awtStyle, fam));
this.extractAttrs.put(codepoint, cpav);
final Map<Integer, Integer[]> ffmap = this.getFFMap(fam);
final Integer[] ia = this.getMapsTo(mapsTo, ffmap);
ia[awtStyle] = codepoint;
...

This means that character '1D5A8' (> 0x1D400) is displayed as 49, but in a sans-serif font.
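The `<font> 0049` field in the UnicodeData.txt entry is the standard Unicode compatibility decomposition of U+1D5A8. A minimal sketch, using only the JDK's java.text.Normalizer rather than jEuclid itself, shows the same mapping; the class and method names are hypothetical, and the threshold constant simply repeats the value quoted from CharacterMapping.java.

```java
import java.text.Normalizer;

// Hypothetical demo class (not jEuclid code).
final class CompatMappingSketch {
    // Same value as HIGHPLANE_MATH_CHARS_START in the excerpt above.
    static final int HIGHPLANE_MATH_CHARS_START = 0x1D400;

    // NFKD applies compatibility decompositions such as "<font> 0049".
    static String compatibilityForm(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    static boolean isHighPlaneMath(int codePoint) {
        return codePoint >= HIGHPLANE_MATH_CHARS_START;
    }

    public static void main(String[] args) {
        int sansSerifI = 0x1D5A8;
        System.out.println(compatibilityForm(sansSerifI)); // prints "I" (U+0049)
        System.out.println(isHighPlaneMath(sansSerifI));   // prints "true"
    }
}
```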

Intelligent2013 commented 3 years ago

As a workaround, we can use:

<math>
 <mglyph fontfamily="STIX Two Math" index="120232"/> 
</math>

The resulting PDF: image ... image
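The index value 120232 in the mglyph element is the decimal form of the code point 1D5A8. A small sketch of the conversion, with hypothetical class and method names:

```java
// Hypothetical demo class: converts a hexadecimal code point, as written in
// the thread, into the decimal value used in the index attribute.
final class GlyphIndexSketch {
    static int decimalIndex(String hexCodePoint) {
        return Integer.parseInt(hexCodePoint, 16);
    }

    public static void main(String[] args) {
        int index = decimalIndex("1D5A8");
        System.out.println(index); // prints 120232
        // Code points above U+FFFF occupy two UTF-16 chars in a Java String.
        System.out.println(new String(Character.toChars(index)).length()); // prints 2
    }
}
```

The same conversion gives 120233 for 1D5A9, the sans-serif capital J.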

Intelligent2013 commented 3 years ago

Or turn off the code point check in the module CharacterMapping.java. But I don't know what side effects that would have.

ronaldtse commented 3 years ago

@Intelligent2013 jEuclid is probably doing that for character compatibility, in case the font doesn't support a particular code point. In the current situation, it shouldn't replace the code points when the font supports them.

ronaldtse commented 3 years ago

Experimentation was done at https://github.com/metanorma/bipm-si-brochure/issues/144 , and BIPM has provided their feedback.

From Michael Stock of BIPM:

Although the two references which you cited, ISO 80000-1 and BIPM JCGM 200:2012 do not mention anything about bolding of symbols for dimensions, it is quite unusual to use bold letters as symbols for dimensions. In fact, bold letters are usually used for matrices.

I would prefer your last table, for the reason that visually it gives the best result. In my opinion the risk that someone would copy the sans serif theta from this table and get a serif theta is acceptable.

The symbols that BIPM wants to use are provided in the last table, i.e.:

[%unnumbered]
[[table3_4]]
.all "Dimensions" symbols use Latin “plain” capital characters, except theta uses Greek “normal” capital character (U+03F4)
[cols="<,<,<"]
|===
| Base quantity | Typical symbol for quantity | Symbol for dimension

| time | stem:[t] | stem:[sf "&#x1D5B3;"]
| length | stem:[l, x, r], etc. | stem:[sf "&#x1D5AB;"]
| mass | stem:[m] | stem:[sf "&#x1D5AC;"]
| electric current | stem:[ii(I), i] | stem:[sf "&#x1D5A8;"]
| thermodynamic temperature | stem:[ii(T)] | stem:[sf "&#x03F4;"]
| amount of substance | stem:[n] | stem:[sf "&#x1D5AD;"]
| luminous intensity | stem:[ii(I)_("v")] | stem:[sf "&#x1D5A9;"]
|===
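The Latin entries in the table follow from the layout of the Unicode Mathematical Alphanumeric Symbols block, where the sans-serif capitals A to Z are contiguous starting at U+1D5A0. A minimal sketch for deriving the entities, with hypothetical class and method names; the theta (U+03F4) lies outside that range and is not covered here.

```java
// Hypothetical helper: maps a Latin capital to its MATHEMATICAL SANS-SERIF
// CAPITAL counterpart and formats it as a numeric character reference.
final class SansSerifCapitalSketch {
    private static final int SANS_SERIF_CAPITAL_A = 0x1D5A0;

    static int sansSerifCapital(char latinCapital) {
        if (latinCapital < 'A' || latinCapital > 'Z') {
            throw new IllegalArgumentException("expected A-Z, got: " + latinCapital);
        }
        return SANS_SERIF_CAPITAL_A + (latinCapital - 'A');
    }

    static String entity(int codePoint) {
        return String.format("&#x%X;", codePoint);
    }

    public static void main(String[] args) {
        // Dimension symbols from the table, in row order (theta omitted).
        for (char c : new char[] {'T', 'L', 'M', 'I', 'N', 'J'}) {
            System.out.println(c + " -> " + entity(sansSerifCapital(c)));
        }
    }
}
```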

@Intelligent2013 can you help update the ADoc source and ensure the output is correct? Thanks!

Intelligent2013 commented 3 years ago

Done. En: image Fr: image

ronaldtse commented 3 years ago

Thanks!