Closed amanda-phet closed 5 years ago
PhET does not currently have a common repo for "math symbols".
"griddle" is the PhET repository for graphs, see https://github.com/phetsims/griddle.
It would be helpful to know how flexible this needs to be. Do we need to handle arbitrary formulas at run-time, or could we enumerate every formula at build-time?
MathJax looks like a pain to try to fit into sims, but it may be possible (tested on IE11, and it is NOT using images anymore, but is manually positioning HTML elements with a custom web font). This means it may be possible to integrate, but I'd anticipate a minimum of 40 hours.
MathML output would be great for some browsers, but isn't supported by others (see http://caniuse.com/#feat=mathml).
It may be possible to use MathJax at build time to lay out some specific formulas, or use a regular LaTeX processor to convert formulas to images (if that is an option for sims).
Also, Curve Fitting includes rasterized LaTeX output images, and would benefit from an improved approach.
Katex is an alternative that should be explored: http://khan.github.io/KaTeX/
Another question that will be critical to know from designers: will all of these math things be static, or will objects change size/value as the user interacts?
The math text will often need to change as the user interacts.
The math text will often need to change as the user interacts.
@amanda-phet, that doesn't present a problem if it's changing between ~20 options that we can embed at build-time, thus I was wondering if we could enumerate all possible options.
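To make the build-time enumeration idea concrete, here is a minimal sketch. The formula keys and asset file names are hypothetical and only illustrate the lookup; they do not come from any PhET repo.

```javascript
// Hypothetical table produced at build time: one pre-rendered asset per formula.
const PRERENDERED_FORMULAS = {
  linear: 'formula-linear.png',
  quadratic: 'formula-quadratic.png',
  cubic: 'formula-cubic.png'
};

// Look up the asset for the current sim state. Fail loudly if the state
// was not enumerated at build time, since nothing can be rendered for it.
function getFormulaAsset( key ) {
  if ( !Object.prototype.hasOwnProperty.call( PRERENDERED_FORMULAS, key ) ) {
    throw new Error( 'Formula not enumerated at build time: ' + key );
  }
  return PRERENDERED_FORMULAS[ key ];
}
```

This only works when the set of formulas is small and known ahead of time, which is exactly the question being asked here.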
Katex is an alternative that should be explored
This definitely looks promising, +1!
After a brief bit of investigation, I can see multiple ways we could use KaTeX. It will add ~150kB to the sims that use it, but it would probably be by far the cleanest approach.
@ariel-phet, let me know if/when I should spend time looking into this more.
@jonathanolson - soon(ish) probably after or concurrent with pendulum lab work.
@jonathanolson - since you were excited to investigate this possibility, perhaps work in a few hours of investigation over the next week or two and report in this issue what you find and an estimate for full inclusion of this feature.
While investigating... Keep in mind that interactive expressions has been (and will be) a requirement of some PhET sims. See example below from Graphing Lines. It would be nice if the layout code for such equations was capable of handling both interactive and static elements. Or rendering fragments of expressions that can be combined with interactive components.
Examples like the one above in https://github.com/phetsims/scenery/issues/457#issuecomment-135838479 suggest to me that we will benefit the most from the ability to render and layout individual glyphs as scenery nodes, so that we have full control over them and can provide interactivity. It may be that the best solution is for us to build something from scratch ourselves.
@samreid - it is true that we may need something built ourselves to handle many cases, but there are already multiple places (such as in trig-tour) where just being able to write out static content/labels in "good looking" math would be incredibly useful.
For static content/labels, images may be the best solution. They are lightweight, we have ultimate control over exactly how things are laid out and look (we can use LaTeX, for example), we can mipmap them, they integrate into scenery layering (no need for DOM layers or adapters), and no additional code library or runtime is required to generate the graphics. Even if there are a few cases where images won't work, we will likely be able to use images in many places. What we should discuss now is what features we need that we cannot get from images.
I'd like to also point out a hybrid approach would be available for many browsers, where it can render the DOM into an image. We could then fall back to using a DOM layer.
It looks like KaTeX is working excellently. It pretty much works out-of-the-box with Scenery's DOM display support, although Scenery doesn't seem to properly detect the vertical bounds (blue part is where it thinks the formula is):
Seems fast to update, and suitable for fully dynamic formulas.
The span with 'katex-html' has a very accurate getBoundingClientRect() in Chrome. Will presumably need to test browser support to see if we can consistently grab that element and get correct bounds.
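A rough sketch of grabbing those bounds, assuming `container` is the DOM element KaTeX rendered into. The helper name is made up for illustration; only the `katex-html` class and `getBoundingClientRect()` come from the observation above.

```javascript
// Read the bounds of KaTeX's visible output instead of the outer container.
// Returns null if the katex-html span cannot be found (e.g. nothing rendered yet).
function getKatexHtmlBounds( container ) {
  const span = container.querySelector( '.katex-html' );
  if ( !span ) {
    return null;
  }
  const rect = span.getBoundingClientRect();
  return { x: rect.left, y: rect.top, width: rect.width, height: rect.height };
}
```

Whether the returned rectangle is accurate outside Chrome is still the open question; the sketch only shows where the measurement would come from.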
Self-contained they said. Easy to bundle they said.
It makes the following woff/woff2 (web font file) requests:
Without the fonts:
With the fonts:
It's likely I can find a way to bundle it all together by patching KaTeX, but it's not as simple as previously expected (I was using a CDN to do my early tests).
http://sosweetcreative.com/2613/font-face-and-base64-data-uri suggests it's likely we'll be able to embed font files in the CSS, however it's probably another bump of ~100kb to deliver all formats with the sim (possibly more, or possibly less if we subset the fonts).
http://caniuse.com/#feat=woff indicates we'd be OK just including the woff files. http://caniuse.com/#feat=woff2 shows we can't just rely on woff2, and we don't want to ship two versions.
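For reference, embedding a single WOFF file as a base64 data URI could look roughly like the sketch below. It assumes a Node.js build step (it uses `Buffer`), and the function name is made up for illustration; it is not what the packaging script actually does.

```javascript
// Build an @font-face rule with the WOFF file inlined as a base64 data URI.
// `fontBytes` is a Node.js Buffer (or Uint8Array) holding the .woff file contents.
function buildEmbeddedFontFace( fontFamily, fontBytes ) {
  const base64 = Buffer.from( fontBytes ).toString( 'base64' );
  return '@font-face {\n' +
         '  font-family: "' + fontFamily + '";\n' +
         '  src: url(data:font/woff;base64,' + base64 + ') format("woff");\n' +
         '}';
}
```

In a build script the bytes would come from something like `fs.readFileSync` on the font file. Base64 inflates the payload by about a third, which is part of why the size estimate goes up.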
Looks like embedding is working well. We'll want to probably customize the embedded font files based on a sim's needs, and we'll probably have most sims with the same needs.
Testing:
Though still having issues with bounds:
It looks like it's somewhat off because the browser hadn't loaded/processed the web fonts yet. Triggering loading of these before construction results in something improved:
May need to look into tricks for detecting resizing like http://smnh.me/web-font-loading-detection-without-timers/, but that would still have incorrect bounds potentially on initialization.
It's possible to trigger loading of webfonts, and use something like https://github.com/JenniferSimonds/FontDetect to poll for when it's loaded (inspects width to see if it changed).
More about detecting changes, with metric-compatible fonts: http://blog.typekit.com/2013/02/05/more-reliable-font-events/
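The width-comparison idea behind that kind of polling can be sketched as follows. This is not FontDetect's actual code; `measureWidth` is an injected, hypothetical function that in a browser might set the font on an off-screen span and return its measured width.

```javascript
// Decide whether a web font is usable by comparing text widths.
// measureWidth( text, fontFamilyList ) must return the rendered width of `text`.
function isFontLoaded( measureWidth, fontFamily ) {
  const sample = 'mmmmmmmmmmlli';
  const fallbacks = [ 'serif', 'sans-serif', 'monospace' ];

  // If the font is available, requesting "font, fallback" should give a different
  // width than the bare fallback for at least one of the generic families.
  return fallbacks.some( function( fallback ) {
    const fallbackWidth = measureWidth( sample, fallback );
    const candidateWidth = measureWidth( sample, '"' + fontFamily + '", ' + fallback );
    return candidateWidth !== fallbackWidth;
  } );
}
```

A poller would call this on a timer until it returns true. As the Typekit post points out, a web font that is metric-compatible with a fallback can make this kind of check report a false negative.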
Adding developer-meeting tag to decide on code review and/or briefly discuss the generation of sherpa preload files.
I am noticing two things as I test this in trig-tour:
1) As I understand it, katex-0.5.1-css-all.js expects the third party libraries in the body of the top level html file. This will require us to place third party libraries in the body of the top level html file for all sims. I am not sure if this is an issue. We could also just do this for sims that require FormulaNode, but I would rather have the top level html file be of similar structure for all sims.
2) Formula node cannot handle translatable strings. For instance:
var xString = require( 'string!TRIG_TOUR/x' );
var string = '\\frac{\\sqrt{' + xString + '}}{\\sqrt{x_2^2+x_3^2}}';
fails while
var string = '\\frac{\\sqrt{' + 'x' + '}}{\\sqrt{x_2^2+x_3^2}}';
passes through OK. Perhaps katex can't handle the encodings on our translatable strings? In this case, I think it is fine to pull 'x' out of translatable strings anyway, but it may be desirable to build up fractions or functions with translatable strings.
I am unfamiliar with TeX notation, so sorry if I am missing something here.
As I understand it, katex-0.5.1-css-all.js expects the third party libraries in the body of the top level html file.
Yup, that's basically what https://github.com/phetsims/chipper/issues/63 does (made commits for this ~3 hours ago).
Perhaps katex can't handle the encodings on our translatable strings?
Perhaps, I'll look into it (assigning to me). If so, FormulaNode should handle stripping those characters out that won't work.
I am unfamiliar with TeX notation, so sorry if I am missing something here.
Looks great to me.
Checked master of KaTeX, saw https://github.com/Khan/KaTeX/commit/d423bec08921b293e0beecaa718226e310ba3ef9, which revealed https://github.com/Khan/KaTeX/issues/243 (related bug).
It looks like the 0.5.1 release doesn't have those improvements, which look like they will explicitly handle treating the bidirectional characters for text (it's within the unicode character ranges in the new regex).
I'll try building and packaging out of master, to see if it resolves the issue.
After further testing, KaTeX won't handle anything that isn't included in its fonts, even disallowing certain latin-accented characters much less arbitrary Unicode. Master just seems to fix the lexer, not the general behavior. See https://github.com/Khan/KaTeX/issues/15 (still open).
For now, we'll just need to not make them translatable (as translators could easily make incompatible strings).
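If we ever want a guard rather than a blanket rule, a conservative check might look like this sketch. It assumes the problem characters are Unicode bidirectional controls plus anything outside printable ASCII; both function names are hypothetical and not part of FormulaNode.

```javascript
// Strip Unicode bidirectional control characters (marks, embeddings, overrides, isolates).
function stripBidiControls( string ) {
  return string.replace( /[\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, '' );
}

// Conservative check: after stripping bidi controls, only printable ASCII
// is treated as safe to splice into a TeX string.
function isFormulaSafe( string ) {
  return /^[\x20-\x7E]*$/.test( stripBidiControls( string ) );
}
```

This is stricter than KaTeX itself, which accepts some non-ASCII input, but it errs on the side of never handing KaTeX a string that makes it throw.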
Assigning back to @jessegreenberg.
From testing issue:
Square root testing issue: https://github.com/phetsims/tasks/issues/425#issuecomment-157080541
Vertical lines issue: https://github.com/phetsims/tasks/issues/425#issuecomment-157110752
For now, we'll just need to not make them translatable (as translators could easily make incompatible strings).
Agreed, it seems like it would be too easy for translators to add a breaking TeX string.
Other general code review comments:
displayMode until reading documentation in setDisplayNode(). Defining 'inline math' for the options might help clarify.
Otherwise, FormulaNode.js is looking good. I was able to play with it a bit in trig tour, but testing was cut short due to issues with the square root on some platforms: https://github.com/phetsims/tasks/issues/425. Otherwise, what I have tested is working well.
Hey developers! I was just talking with @pixelzoom about upcoming math sims, and we're noticing again that it would be useful to not have to customize the appearance of mathematical statements for every sim. The upcoming sims include Pascal's Triangle, the Area Model Multiplication suite, and also the NumberKeypad (https://github.com/phetsims/scenery-phet/issues/283).
Deferred until Dec 8
Consensus was it would be best to build our own custom layout manager that leverages scenery nodes as part of the coding effort of Area Model. The idea would be to begin by supporting simple expressions like polynomials, fractions, and super/subscripts, with the ability to add features as needed.
@jessegreenberg will create (and name) a new repo for this work. @amanda-phet will create the first issue in that repo, which will be for documenting the initial design objectives (@samreid suggested giving several visual examples of the types of expressions envisioned to be supported in the near-ish future, such as colored text in polynomials, potential background highlighting, etc.).
supporting simple expressions like polynomials, fractions, and super/subscripts, with the ability to add features as needed.
And the expressions shouldn't be limited to well-formed expressions. PhET needs the ability to render fragments of expressions, which can then be combined with interactive UI components. See https://github.com/phetsims/scenery/issues/457#issuecomment-135838479.
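As a rough illustration of what such layout code might compute for a fraction, here is a sketch that works purely on sizes, so each part could be a text glyph or an interactive component. The function name, option names, and default values are all made up for the example.

```javascript
// Position a numerator and denominator around a horizontal fraction bar.
// Inputs are plain { width, height } sizes; returned offsets are relative
// to the top-left corner of the whole fraction.
function layoutFraction( numerator, denominator, options ) {
  options = Object.assign( { barThickness: 1, padding: 2 }, options );
  const width = Math.max( numerator.width, denominator.width );
  const barY = numerator.height + options.padding;
  const denominatorY = barY + options.barThickness + options.padding;
  return {
    width: width,
    height: denominatorY + denominator.height,
    numerator: { x: ( width - numerator.width ) / 2, y: 0 },
    bar: { x: 0, y: barY, width: width, height: options.barThickness },
    denominator: { x: ( width - denominator.width ) / 2, y: denominatorY }
  };
}
```

Because the inputs are just sizes, a numerator could be a number picker while the denominator is static text, which is the mixed interactive/static case from Graphing Lines.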
In https://github.com/KaTeX/KaTeX/issues/1046, it seems like there exists a unicodeTextInMathMode which can be used to make formulas with unicode characters. Because it seems like the conclusion to https://github.com/phetsims/rosetta/issues/183 is that math symbols should be translatable, but the conclusion to https://github.com/phetsims/scenery/issues/457#issuecomment-156634759 is that formulas shouldn't be translatable, I'll put a meeting:developer label on this issue.
The linked issue in https://github.com/phetsims/scenery/issues/457#issuecomment-156634759 has since been closed. This issue is currently affecting https://github.com/phetsims/curve-fitting/issues/132.
What type of symbols are translators planning on using? It looks like unicodeTextInMathMode was merged to master in https://github.com/KaTeX/KaTeX/pull/1117, do they have a release where that is included that we could test?
I'm not too sure what symbols will be used, but certain characters right now cause KaTeX to crash and by extension cause the simulation to freeze. For example, when I allowed the formula in Curve Fitting to be translated, the simulation would fail the rtl stringTest because the presumably Arabic characters weren't allowed for math rendering. Looking at https://github.com/phetsims/rosetta/issues/183, it seems like substituting x and y for chi and psi is common.
I would think that there is at least one KaTeX release since the pull request because the pull request was merged on Feb 19, 2018 and the latest release (0.10.2) happened on May 12 of this year. If I'm looking at sherpa correctly, the version of KaTeX we are using is 0.5.1 which was released on Sep 1, 2015 which is well before the PR.
@ariel-phet, it sounds good for us to look into using a newer KaTeX version. Updating may be a bit more involved, as we have about 150 lines of code dedicated to packaging KaTeX resources into a file that is a self-contained preload (see sherpa/katex/packageKatexCSS.js).
Should I look into this?
Decided at developer meeting that I'll write up some more details and assign to @SaurabhTotey for investigation. If there are problems or questions, let me know.
Over in https://github.com/phetsims/vector-addition/issues/100, I noted that the KaTeX update is (preferably) a prerequisite to publishing Vector Addition 1.0.0. That means it should be completed in the next month.
Just as an update, I have managed to get KaTeX 0.11.0 packaged in Sherpa, and I have committed that locally (not pushed yet). This seems to include KaTeX correctly, but it breaks FormulaNode because the FormulaNode Bounds computations are now incorrect. I am working on seeing if I can get correct Bounds computations before I push everything.
I have migrated Sherpa to use the new KaTeX and FormulaNode now works the same as before. I will now start investigating how to incorporate unicode characters into formulas.
Formulas in FormulaNode now allow for unicode characters. However, if the font doesn't support a character, a warning is thrown.
As such, I am wondering: is this acceptable behaviour? If so, the migration to a newer KaTeX version is done.
Do we know what the unsupported characters are? (Does that mean it doesn't have a custom font for it and it still shows up, OR that something unsupported just would be invisible or something)?
I believe that it means that it doesn't have a custom font for the characters, but they still show up. It seems like the RTL string is still getting substituted in, and it seems to be fully visible from what I can tell. I don't know what the unsupported characters are.
On the KaTeX documentation, they say
KaTeX will accept all Unicode letters in both text and math mode. All unrecognized characters will be treated as if they appeared in text mode, and are subject to the same issues of using system fonts and possibly using incorrect vertical alignment.
That seems like acceptable behavior to me, although if there's a way to suppress the console output, that might be nice.
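One possible way to quiet the console, sketched under the assumption that newer KaTeX versions accept a `strict` option that can be a per-error handler function, and that the error codes below are the ones involved (worth checking against the KaTeX options documentation before relying on it):

```javascript
// Error codes we are willing to silence (assumed names, see the KaTeX options docs).
const QUIET_CODES = [ 'unknownSymbol', 'unicodeTextInMathMode' ];

// Per-error strict handler: ignore the Unicode-related codes, keep warning for the rest.
function strictHandler( errorCode ) {
  return QUIET_CODES.indexOf( errorCode ) >= 0 ? 'ignore' : 'warn';
}

const katexOptions = {
  displayMode: true,
  throwOnError: false,
  strict: strictHandler
};

// In a sim this would be passed along the lines of:
//   katex.render( formula, element, katexOptions );
```

Keeping `'warn'` for every other code means unrelated TeX problems still show up in the console.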
From dev meeting 08/29/19:
@pixelzoom has experienced some whacky behavior with bounds in Vector Addition when working with KaTeX. FormulaNode's baseline didn't cooperate with Scenery's baseline in the equation.
@ariel-phet We don't have any math sims on the horizon so we can continue with a temporary in-house solution. We can revisit in the future, but for now, just use what is required for sims like Curve Fitting.
@SaurabhTotey doesn't need to take further action. For future reference, use FormulaNode sparingly; a custom solution may be used on a case-by-case basis. Closing this issue.
We need a way to consistently incorporate math elements into math sims. For example, fractions, square roots, variables, and other math symbols should be easy to implement in any new math sim (with an appearance consistent with LaTeX). Graphs should also have a consistent appearance.
I've started a google doc with some ideas and resources. We should probably try to discuss this in detail and flesh out all of the needs but I wanted to get a github issue started since we may need this before Function Builder gets started, as it has some complicated equations.
Kathy suggested starting an issue in scenery. Assigning to @ariel-phet for prioritization and appropriate assignment.