mathjax / MathJax-src

MathJax source code for version 3 and beyond
https://www.mathjax.org/
Apache License 2.0

Missing characters #389

Closed jacobp100 closed 4 years ago

jacobp100 commented 4 years ago

I noticed that the mu and Angstrom symbols are missing from the normal font (there could be more too). The mml to svg example falls back to a text element for them, so the output doesn't look completely broken. However, I'm using React Native, and text elements do not work in my setup.

Is excluding these characters intentional? If so, I reckon I could just monkey patch the characters I need. If not, I'm happy to submit a PR with these characters

In either case, do you have any pointers for how you generated the bounding boxes along with svg path data for the characters? I noticed the mathjax-dev repo, but couldn't get any of that to work

Thanks!

dpvc commented 4 years ago

The MathJax fonts do not include every possible character, as you have found. They are fairly minimal to make them small for delivery over the web, and are based on the original TeX Computer Modern fonts. TeX produces the Angstrom symbol by adding a ring accent to an "A", and MathJax does that as well, but that makes it harder to do in MathML input. MathJax's basic fonts include italic Greek letters like mu, but not upright ones, and those are in the Greek and Coptic Unicode block, not at U+00B5 (the micro sign), which may be where you are trying to obtain it.
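To make the code-point distinction concrete, here is a small sketch in plain JavaScript, independent of MathJax. The helper name `codePoints` is made up for illustration. It shows that the micro sign and the Greek mu are different characters that Unicode normalization relates, and likewise for the Angstrom sign and A-with-ring.

```javascript
// Format each character of a string as U+XXXX (hypothetical helper).
const codePoints = (s) =>
  [...s].map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));

const micro = '\u00B5';    // MICRO SIGN (Latin-1 Supplement)
const greekMu = '\u03BC';  // GREEK SMALL LETTER MU (Greek and Coptic)
const angstrom = '\u212B'; // ANGSTROM SIGN (Letterlike Symbols)
const aRing = '\u00C5';    // LATIN CAPITAL LETTER A WITH RING ABOVE

console.log(codePoints(micro + greekMu + angstrom + aRing));
// → [ 'U+00B5', 'U+03BC', 'U+212B', 'U+00C5' ]

console.log(micro === greekMu);                    // → false
console.log(micro.normalize('NFKC') === greekMu);  // → true
console.log(angstrom.normalize('NFC') === aRing);  // → true
```

Normalizing a whole document with NFKC would also change other characters, so this is meant only as an illustration of why a lookup for U+00B5 can miss a glyph stored at U+03BC.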

One of the next phases of the v3 conversion is to rebuild the fonts, and to make better font tools available so that page authors can add individual characters, or collections of characters, as needed. Unfortunately, those tools aren't yet ready, and won't be until next year.

It is possible to use a MathML input jax prefilter to convert <mi>Å</mi> in your input to the MathML needed to make the accented "A", and to convert U+00B5 to U+03BC, before the MathML is processed. That might do what you need. Here is an example:

<script>
MathJax = {

  angstrom: [
    '<mover>',
      '<mi mathvariant="normal"$1>A</mi>',
      '<mpadded voffset="-.07em" height="-.07em">',
        '<mo$1>\u02DA</mo>',
      '</mpadded>',
    '</mover>'
  ].join(''),

  mml: {
      forceReparse: true
  },
  startup: {
    pageReady() {
      MathJax.startup.document.inputJax[0].preFilters.add((data) => {
        data.data = data.data.replace(/<mi([^>]*)>\u212B<\/mi>/g, MathJax.config.angstrom).replace(/\u00B5/g, '\u03BC');
      });
      return MathJax.startup.defaultPageReady();
    }
  }
};
</script>
<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/mml-svg.js"></script>

This looks through your MathML input and replaces <mi>Å</mi> with the <mover> construction needed by MathJax (while trying to preserve any attributes that are on the <mi>), and replaces all occurrences of U+00B5 with U+03BC. The mu will be italic, but that is the only form that MathJax has in its fonts. Adding new paths and bounding-box data for these characters could be done, but it would take some work, as the tools needed to automate it aren't currently available.
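For setups without a page to configure (such as the React Native case above), the same string rewrite can be tried on its own before the MathML is handed to MathJax. The following is only a sketch of that idea: `ANGSTROM_MML`, `fixMissingChars`, and the sample input are names and data made up for illustration, not part of MathJax.

```javascript
// Replacement MathML for the Angstrom sign; $1 carries over the
// attributes captured from the original <mi>.
const ANGSTROM_MML = [
  '<mover>',
  '<mi mathvariant="normal"$1>A</mi>',
  '<mpadded voffset="-.07em" height="-.07em">',
  '<mo$1>\u02DA</mo>',
  '</mpadded>',
  '</mover>'
].join('');

// Hypothetical helper: apply the same two replacements as the prefilter.
function fixMissingChars(mathml) {
  return mathml
    .replace(/<mi([^>]*)>\u212B<\/mi>/g, ANGSTROM_MML) // Angstrom sign -> A with ring accent
    .replace(/\u00B5/g, '\u03BC');                     // micro sign -> Greek mu
}

const input = '<math><mn>5</mn><mi class="unit">\u212B</mi><mi>\u00B5</mi></math>';
const output = fixMissingChars(input);
console.log(output);
```

Note that the pattern only matches U+212B, so an input that encodes the symbol as U+00C5 would pass through unchanged.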

jacobp100 commented 4 years ago

Thanks for the detailed response!

I tried playing around with generating these with the Computer Modern typeface and opentype.js. I only really looked at the character 'A' to try and replicate that

I noticed comparing CM to the path in the roman font here that your font seems weightier - but the bold CM font was too bold

Do you know how these fonts were generated? I'm wondering if I can just generate the missing chars I need and monkey patch them in

Or alternatively, I noticed that the font representation structure isn't too complicated: each character looks like three numbers along with the path data. I know exactly what characters I need, and I'm only targeting SVG

What I'm considering is just generating my own font tables off the CM font, and then monkey patching them (probably with some webpack trickery). Is there anything obviously bad with this idea? Some of the characters may change in dimensions, and I noticed the dimensions come from a common folder, I'm not sure if any code references those dimensions directly. Also, I'm not entirely sure how large operators and brackets will behave

If I go ahead with this experiment, I'll be happy to share what I made! 😄

jacobp100 commented 4 years ago

To give an update on this, I booted up my old laptop that runs Linux, and managed to get MathJax-dev working. Spent quite a lot of time trying to get Computer Modern Roman to export the upright Greek letters, and after much time, discovered there aren't upright Greek characters in that font 😅

This at least explains why they aren't included in MathJax - so this issue can be closed

For posterity though, I looked at the packages upgreek and siunitx. The latter uses a different font for upright Greeks, so pretty much a no-go

Upgreek seems to have characters that, as far as I can tell, were not made via Metafont, so they can't go through the same processing that happens in MathJax-dev. These Greeks look to be included in the OTF font Computer Modern Unicode, and that looks to be an easier way to obtain them. At the moment, I'm playing around with FontForge's glyph processing to make these characters match a bit closer to the processed fonts MJ uses, and will update here when I do

jacobp100 commented 4 years ago

The update to this: even the CMU font takes these glyphs from a different font (cbgreek). But digging further on Stack Overflow, one of the answers skewed the italic font to make the upright versions. If someone wants to do this, you can take the path data and skew it with:

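// Shear a glyph horizontally by `deg` degrees: x' = x + y * tan(deg).
// `char` is a font-table entry whose last element is { p: <path data> }.
const skewX = (char, deg) => {
  const c = Math.tan((deg * Math.PI) / 180); // shear factor
  const data = char.slice(); // copy so the source entry is not mutated
  const p = data
    .pop()
    .p.replace(
      // Rewrites "x y" coordinate pairs only; single-argument H/V commands are left as-is
      /(-?\d+)\s+(-?\d+)/g,
      (_, x, y) => `${(+x + +y * c).toFixed(0)} ${y}`
    );
  data.push({ p });
  return data;
};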

So, to do this for all the lowercase Greek letters:

for (let i = 0x3b1; i <= 0x3c9; i += 1) {
  normal[i] = skewX(italic[i], -15);
  bold[i] = skewX(boldItalic[i], -15);
}
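As a quick sanity check of the skew arithmetic, here is a minimal sketch that works on a bare path string rather than a font-table entry. The function name `skewPath` and the sample path are made up for illustration. It assumes the path contains only "x y" coordinate pairs, and it uses the tangent of the angle as the shear factor.

```javascript
// Hypothetical helper: shear every "x y" pair in an SVG path string
// by `deg` degrees along the x axis, i.e. x' = x + y * tan(deg).
const skewPath = (path, deg) => {
  const c = Math.tan((deg * Math.PI) / 180);
  return path.replace(
    /(-?\d+)\s+(-?\d+)/g,
    (_, x, y) => `${(+x + +y * c).toFixed(0)} ${y}`
  );
};

// Made-up path: a single line segment from (100, 200) to (300, 400).
const sample = 'M100 200L300 400';

console.log(skewPath(sample, -15)); // → M46 200L193 400
console.log(skewPath(sample, 0));   // → M100 200L300 400
```

Points higher above the baseline move further left for a negative angle, which is what pulls a slanted italic glyph toward upright. Coordinates are rounded to integers, so small distortions are possible.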