mathjax / MathJax

Beautiful and accessible math in all browsers
http://www.mathjax.org/
Apache License 2.0

Add support for new Mathematical Web Fonts #489

Closed fred-wang closed 10 years ago

pkra commented 11 years ago

@fred-wang Where should I leave bug reports for your webfont branch?

pkra commented 11 years ago

@fred-wang here's a bug with jquery-mobile and your open-type branch creating too much whitespace after a math expression http://jsfiddle.net/8AmED/. This doesn't happen with v2.2.

fred-wang commented 11 years ago

On Chrome the TeX source is not parsed and on Firefox I see image fonts... is it a security restriction or something? Also, note that the branch is not based on the final version of v2.2 but on an earlier version of Davide's develop branch (before localization changes) so some fixes may not have been integrated yet.

pkra commented 11 years ago

Ok, didn't know that branch wasn't up-to-date. Will you merge v2.2 in anytime soon?

I didn't see any problem on Chrome (it was slow, but not too bad); Firefox will get stuck at missing CORS headers from github. I can zip it up or you can just download jquery-mobile ;).

dpvc commented 11 years ago

I've gone through the fontdata file for STIX-web pretty carefully, and here is what I've found:

Those are the things that I see at the moment. I have not looked closely at the other web font data, but will do so.

dpvc commented 11 years ago

I found the problem with the centering of the large operators: the SVG code was using the wrong sign (but since the MathJax TeX fonts have these centered already, it didn't matter). There is also a change needed to munderover to handle the shifted position properly.

I also found the problem with the changes to the integral spacing: The SVG code doesn't use the rfix value, and the 5th item is actually the path data, so you were wiping out the path by setting it. Because the SVG output jax doesn't use the system's font code (it places the paths itself), you just have to modify the bounding box data (there is no need to compensate for the actual font size, since that is never used). The sizes you used also were not always large enough, so I've adjusted them.

Finally, the problem with the space in the monospace font is that the monospace fonts don't include the nonbreaking space (U+00A0), and so that was being taken from the Main font instead, which is of the wrong size. This could be fixed by adding the U+00A0 character to the monospaced font (probably a good idea), but for now, I've added a remapping of U+00A0 to U+0020 in the fontdata file.

The changes I made are in the open-type-fonts branch of my fork of MathJax. These are the differences.
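The U+00A0 remapping mentioned above can be pictured as a lookup applied before a glyph is searched for in a font. The following is only a minimal sketch of that idea, not the actual fontdata code: `REMAP`, `monospaceGlyphs`, `resolveCodePoint`, and `hasGlyph` are hypothetical names chosen for illustration.

```javascript
// Hypothetical remap table: a code point on the left is replaced by
// the code point on the right before the font lookup happens.
const REMAP = { 0x00A0: 0x0020 };

// Hypothetical monospace font that has a space glyph but no U+00A0.
const monospaceGlyphs = new Set([0x0020, 0x0041, 0x0042]);

// Return the code point that should actually be looked up.
function resolveCodePoint(codePoint, remap) {
  return Object.prototype.hasOwnProperty.call(remap, codePoint)
    ? remap[codePoint]
    : codePoint;
}

// Report whether the font can supply a glyph once remapping is applied.
function hasGlyph(glyphs, codePoint, remap) {
  return glyphs.has(resolveCodePoint(codePoint, remap));
}
```

With the remap in place, a request for U+00A0 is served by the monospace font's own U+0020 glyph and no longer falls through to a differently sized font.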

fred-wang commented 11 years ago

OK thanks for looking into this. I think you can extract the changes to the SVG output jax and merge them into the develop branch (I'm not sure we really want the changes to avoid infinite loop when DELIMITERS is incorrect). The changes to the fontdata should be integrated in the fontdata-adjust.js files of the Python scripts that generate them. I'll look into this next Monday.

dpvc commented 11 years ago

Right. I only committed them to this branch in order for you to be able to see them. I didn't take time to figure out where the changes had to be made in your font-generation code.

I'll move the changes to the SVG output jax to the develop branch.

fred-wang commented 11 years ago

There are references in the DELIMITERS array to Size4 characters U+23AA, U+23B0, and U+23B1

These are not in the Open Type Math table but were in the old STIX fonts, so I've added them by hand: https://github.com/fred-wang/MathJax-dev/blob/open-type-fonts/fonts/OpenTypeMath/STIX-Web/config.py#L72 (it seems to be the only extra stretchy ops whose pieces are non-unicode chars, so that may be a bug in the Python script)

-STIX-variant should be -STIX-Web-variant

Fixed.

There is no need for TTF versions in the fonts/HTML-CSS directories

OK, I've removed the SVG and TTF fonts.

The variants are not being handled properly. looks like the extra 3 MathJax glyphs are not in the NonUnicode fonts

I didn't really take care of the NonUnicode fonts, currently they are basically just the "remaining" glyphs once we have moved all the others in smaller fonts. I'm wondering if the important characters that MathJax really uses should just be moved in existing/new fonts and the NonUnicode fonts be dropped. Or if some users are willing to use these characters anyway and so the NonUnicode should be provided with the appropriate CMAP...

The DELIMITER data for U+20D0, U+20D1,

I think these are from the Open Type Math table. Currently the script reads the table and moves everything in fontdata.js. Some config options could probably be added to tell whether some glyphs should go in extra or not.

In the current setup, the script, fraktur, double-struck, and other mathvariants

I think we should discuss again how the mathvariants/style are handled. I'd like to have something consistent between MathJax/browsers/MathML spec and between the various fonts; that requires fewer workarounds and allows the Python script to handle the fonts uniformly (unfortunately, this ideal case is probably not possible, but we can try to find the best situation).

There are a couple of delimiters that could be made infinitely stretchy

I think these are from the Open Type Math table and the script does not currently change/complete the values provided by the font designer. Again, a config option could be added to customize this. Probably the changes should be reported to the STIX consortium.

Finally, the problem with the space in the monospace font

I've modified the script so that the monospace fonts have both the 0x20 and 0xA0 glyphs.

I also found the problem with the changes to the integral spacing

I've updated the fontdata-adjust.js files to import your changes.

fred-wang commented 11 years ago

There are references in the DELIMITERS array to Size4 characters U+23AA, U+23B0, and U+23B1, but these are not in that font.

OK, the glyphs were correctly copied in the Size5 of PUA but the data were not generated correctly (it used Size4 and didn't print the PUA code point). I've fixed that so that should work now. Note that U+23AA is no longer stretchy by default, though.

dpvc commented 11 years ago

I'm not sure why these three characters are being put in PUA when they have assigned Unicode values (U+23AA, U+23B0, and U+23B1). Why aren't they placed at those positions? And why is Size5 the place for them, when they are really at the Size1 (or Main) size? Why don't they go into one of those fonts?

dpvc commented 11 years ago

The variants are not being handled properly. looks like the extra 3 MathJax glyphs are not in the NonUnicode fonts

I didn't really take care of the NonUnicode fonts, currently they are basically just the "remaining" glyphs once we have moved all the others in smaller fonts. I'm wondering if the important characters that MathJax really uses should just be moved in existing/new fonts and the NonUnicode fonts be dropped. Or if some users are willing to use these characters anyway and so the NonUnicode should be provided with the appropriate CMAP...

I'm fine with having the used characters moved to other fonts. But I wouldn't want to drop the others entirely. Someone is bound to say "Character XX is in the STIX fonts, but I can't find it in MathJax." If these not-often-used characters are in a separate font, then it should not hurt MathJax to have the data for them, as it will rarely be loaded, but will be there if someone needs it.

The DELIMITER data for U+20D0, U+20D1,

I think these are from the Open Type Math table. Currently the script reads the table and moves everything in fontdata.js. Some config options could probably be added to tell whether some glyphs should go in extra or not.

Don't you already have that here? It looks like they could just be added to this list, no?

There are a couple of delimiters that could be made infinitely stretchy

I think these are from the Open Type Math table and the script does not currently change/complete the values provided by the font designer. Again, a config option could be added to customize this. Probably the changes should be reported to the STIX consortium.

The suggested changes are not critical.

fred-wang commented 11 years ago

I'm not sure why these three characters are being put in PUA when they have assigned Unicode values (U+23AA, U+23B0, and U+23B1). Why aren't they placed at those positions? And why is Size5 the place for them, when they are really at the Size1 (or Main) size? Why don't they go into one of those fonts?

Well, that's a very special case that is not handled "normally". The script copies the non-Unicode glyphs to the Size5 PUA, so I've kept the same procedure. These glyphs are also used as components so they will be copied to the Size5 PUA anyway. Again the solution would be to report that to the STIX consortium, so that they are directly specified in the Open Type Math table.

I'm fine with having the used characters moved to other fonts. But I wouldn't want to drop the others entirely. Someone is bound to say "Character XX is in the STIX fonts, but I can't find it in MathJax." If these not-often-used characters are in a separate font, then it should not hurt MathJax to have the data for them, as it will rarely be loaded, but will be there if someone needs it.

Currently they are in non-Unicode position, so not accessible to Web authors by normal means anyway. I agree with having those used by MathJax in specific fonts and the remaining not-often-used characters in a separate font.

Don't you already have that here? It looks like they could just be added to this list, no?

I think this is only for custom stretchy operators (If I remember correctly), I have to check if that works for those in the Open Type Math font or otherwise teach the script to handle that case.

dpvc commented 11 years ago

These glyphs are also used as components so they will be copied to the Size5 PUA anyway.

Even if they already exist at another location in another font? Is that required? For example, if the parenthesis pieces are moved to their correct locations in the Misc. Technical block, will you still be moving them to Size5 anyway?

Currently they are in non-Unicode position, so not accessible to Web authors by normal means anyway.

I guess you are referring to the Word version of the files, but the STIXGeneral versions had these in the PUA at U+E000 through U+E3CF, and these do have CMAP entries and are accessible through normal means. (Indeed, that is how MathJax can access them.)

fred-wang commented 11 years ago

Even if they already exist at another location in another font? Is that required? For example, if the parenthesis pieces are moved to their correct locations in the Misc. Technical block, will you still be moving them to Size5 anyway?

If the parenthesis pieces are moved to their correct Unicode locations, then they won't be in Size5 but directly accessed from the MISC font (for both HW and stretch) as it's currently the case, for example, for 0x27F0.

I guess you are referring to the Word version of the files

Yes, I was referring to STIX-Word.

fred-wang commented 11 years ago

Don't you already have that here? It looks like they could just be added to this list, no?

I stand corrected, the script is already able to handle that. I've updated the list.

fred-wang commented 11 years ago

I've updated the MathJax-dev branch (not the MathJax branch, though)

It looks like the extra 3 MathJax glyphs are not in the NonUnicode fonts, so they aren't being detected properly (MathJax loads them, but times out trying to detect them).

I fixed that, but we'll need to decide how to split this font.

There are a couple of delimiters that could be made infinitely stretchy (but they have only fixed size data):

I've added them.

fred-wang commented 11 years ago

The variants are not being handled properly

Davide, do you have a list of characters that require a variant? I see they are in STIXVariant and STIXNonUnicode from the STIXGeneral set, but I don't know which are precisely used by MathJax.

fred-wang commented 11 years ago

(Additionally, it would help to know how I can check that the variants are correctly picked.)

dpvc commented 11 years ago

Here is the list of variants produced by the TeX input jax, together with the tex macros that produce them.

U+2032  \prime
U+2033  x''
U+2034  x'''
U+2035  \backprime
U+2036
U+2037
U+2057  x''''
U+210F  \hbar
U+21CC  \rightleftharpoons
U+2205  \varnothing
U+2216  \backslash
U+2216  \setminus
U+2223  \shortmid
U+2224  \nshortmid
U+2225  \shortparallel
U+2226  \nshortparallel
U+223C  \thicksim
U+2248  \thickapprox
U+2268  \lvertneqq
U+2269  \gvertneqq
U+2270  \nleqq
U+2271  \ngeqq
U+2288  \nsubseteqq
U+2289  \nsupseteqq
U+228A  \varsubsetneq
U+228B  \varsupsetneq
U+22E0  \npreceq
U+22E1  \nsucceq
U+2322  \smallfrown
U+2323  \smallsmile
U+25B3  \vartriangle
U+25BD  \triangledown
U+2A87  \nleqslant
U+2A88  \ngeqslant
U+2ACB  \varsubsetneqq
U+2ACC  \varsupsetneqq

One way to test them would be to see if they produce different symbols from the plain character (as produced by \unicode{xNNNN} in TeX, or by a regular entity reference in MathML).
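The comparison suggested above can be prepared mechanically: for each entry, pair the macro with the `\unicode{xNNNN}` form of the same code point so that the two can be typeset side by side. This is a rough sketch assuming only a handful of entries from the list; `VARIANTS`, `plainTeX`, and `comparisonRows` are hypothetical helper names, and the result is just a set of TeX strings to feed to whatever test page is used.

```javascript
// A few entries copied from the list above: [code point, TeX macro].
const VARIANTS = [
  [0x2032, "\\prime"],
  [0x210F, "\\hbar"],
  [0x2205, "\\varnothing"],
  [0x2216, "\\setminus"],
];

// Format a code point as four (or more) uppercase hex digits.
function hex(codePoint) {
  return codePoint.toString(16).toUpperCase().padStart(4, "0");
}

// Build the TeX that requests the plain character via \unicode{xNNNN}.
function plainTeX(codePoint) {
  return "\\unicode{x" + hex(codePoint) + "}";
}

// Pair each macro with its plain counterpart for side-by-side display.
function comparisonRows(variants) {
  return variants.map(([codePoint, macro]) => ({
    codePoint: "U+" + hex(codePoint),
    variant: macro,
    plain: plainTeX(codePoint),
  }));
}
```

Each row can then be rendered as two adjacent expressions; if the variant is picked up correctly, the two glyphs should differ visibly for fonts that actually provide a distinct variant.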

fred-wang commented 11 years ago

OK, thanks. I've done a quick check and it seems that this does not correspond exactly to those available in STIX (some are not available, others are not used in MathJax). I guess I'll just move all the *.var glyphs into a separate Variant font as you suggested. I'll try to do the same for the other fonts too.

FYI and if we want to consider that in the future, here is however a testcase to access them via font-feature-settings. That seems to work for me in Firefox and Chrome:

     <p style="-moz-font-feature-settings: 'ss03'; -webkit-font-feature-settings: 'ss03'; font-family: STIX;">w&#x007C;&#x019B;&#x0264;&#x2032;&#x2033;&#x2034;&#x2035;&#x2036;&#x2037;&#x2057;&#x210F;&#x2140;&#x2190;&#x2191;&#x2192;&#x2193;&#x21D1;&#x21D3;&#x21E0;&#x21E2;&#x2205;&#x2208;&#x2209;&#x220B;&#x220C;&#x220F;&#x2210;&#x2211;&#x2216;&#x221A;&#x221B;&#x221C;&#x221D;&#x2223;&#x2224;&#x2225;&#x2226;&#x2229;&#x222A;&#x223C;&#x223E;&#x223F;&#x2248;&#x224C;&#x2272;&#x2273;&#x2293;&#x2294;&#x2295;&#x2297;&#x229C;&#x22DA;&#x22DB;&#x2322;&#x2323;&#x2423;&#x25A9;&#x2A3C;&#x2A3D;&#x2A9D;&#x2A9E;&#x2AAC;&#x2AAD;&#x2ACB;&#x2ACC;&#x2AEE;</p>

fred-wang commented 11 years ago

I've updated my MathJax-dev branch. I found the variants in STIX-Word for most of these characters, except U+21CC \rightleftharpoons. The difference is not obvious for me with the local STIX fonts. I don't really see the difference for \thicksim and \thickapprox either (with the MathJax TeX fonts I have to zoom in to see that they are thicker).

dpvc commented 11 years ago

It may be that the STIX fonts don't have those variants (some of the variants were only in the TeX fonts, and I think \thicksim was one, so \thickapprox probably is as well).

fred-wang commented 11 years ago

I think STIX-Word has the \thickapprox and \thicksim variants, it's just that the visual difference is not obvious. The only one I could not find was \rightleftharpoons (but it was in STIX-General I think)

dpvc commented 11 years ago

It turns out that the \rightleftharpoons variant is in the MathJax TeX fonts, not the STIX fonts, so that is why it appeared in the list above. I don't see the difference with the \thicksim and \thickapprox either (though I do see that the variant character is available in STIXVariants, but it is not different from the original glyph). Again, these two are really there for the MathJax fonts, which do have a different glyph.

fred-wang commented 11 years ago

OK I'm pushing an updated version to the branch.

fred-wang commented 11 years ago

So I just had a look at the variants for the other fonts. It seems that the most important are prime symbols, since the default size is too small, except for Asana. The others (Gyre-*, Latin Modern, Neo Euler) have larger variants for the prime symbols, so I've added them. It seems that there are other variants but that does not really correspond to MathJax's need.

fred-wang commented 10 years ago

I've created a clean branch, issue489. The only difference from the previous open-type-fonts branch should be the addition of a few variants for the non-STIX fonts.

I'll send a pull request.

dpvc commented 10 years ago

OK, I've merged this, and am closing this issue. Start new tracker issues for any new problems that are discovered.

=> Merged.

fred-wang commented 10 years ago

I'm not sure whether this happened before and is only visible now that the testing framework accepts arbitrary fonts. But if you pass an invalid font, it seems that MathJax won't fall back to what was introduced in #363, and *.FONTDATA may be null. See:

http://devel.mathjax.org/testing/testsuite/API/Hub/setRenderer-1.html?&mathJaxPath=http://devel.mathjax.org/mathjax/MathJax-develop/unpacked/&font=STIX&outputJax=HTML-CSS (will raise an error in the SVG output, which does not support local STIX)

http://devel.mathjax.org/testing/testsuite/API/Hub/setRenderer-1.html?&mathJaxPath=http://devel.mathjax.org/mathjax/MathJax-develop/unpacked/&font=xxx&outputJax=HTML-CSS (will raise an error in the HTML-CSS output)
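The kind of guard that would avoid a null FONTDATA for an unknown font name can be sketched as follows. This is purely illustrative and does not reproduce MathJax's own logic or the change made in #363: `AVAILABLE_FONTS`, `chooseFont`, and the font names in the list are assumptions made for the example.

```javascript
// Hypothetical list of fonts that an output jax has font data for.
const AVAILABLE_FONTS = ["TeX", "STIX-Web", "Asana-Math"];

// Pick the requested font if it is known; otherwise fall back to a
// default so that later code never sees a missing font-data object.
function chooseFont(requested, available, fallback) {
  if (!available.includes(fallback)) {
    throw new Error("fallback font is not available: " + fallback);
  }
  return available.includes(requested) ? requested : fallback;
}
```

For instance, `chooseFont("xxx", AVAILABLE_FONTS, "TeX")` returns `"TeX"`, while a known name such as `"STIX-Web"` is returned unchanged.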

fred-wang commented 10 years ago

OK, this is not a regression since it happened with MathJax v2.2 too: http://devel.mathjax.org/testing/testsuite/API/Hub/setRenderer-1.html?&mathJaxPath=http://cdn.mathjax.org/mathjax/latest/unpacked/&font=STIX&outputJax=HTML-CSS