Open klemay opened 5 years ago
I think this is separate from #414, as #414 refers to writing math expressions in our text editor, whereas this issue is regarding highlighting math expressions as part of an annotation.
I agree. This is a completely separate issue.
I just came across this after discovering that highlighting mathematics doesn't work. Hypothes.is is new to me: it was suggested as a tool by our Teaching Centre, so I tried adding it to one of my books: http://www.cs.uleth.ca/~fitzpat/apex-hypothesis/sec_continuity.html
If I try to highlight any MathJax-rendered text, I get an annotation mark on the right-hand side, but no highlighting. This is probably because MathJax is also JavaScript, so two competing JavaScript components are trying to render that piece of the page.
If you try to highlight a paragraph containing math (which is almost every paragraph in a calculus book!) the highlighting extends to the first appearance of math.
Thanks for pointing this out, @sean-fitzpatrick.
I'm curious how one selects anything at all in the latest v3 of MathJax. I tried their sample page and couldn't.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<title>MathJax example</title>
<script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
<script id="MathJax-script" async
src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js">
</script>
</head>
<body>
<p>
When \(a \ne 0\), there are two solutions to \(ax^2 + bx + c = 0\) and they are
\[x = {-b \pm \sqrt{b^2-4ac} \over 2a}.\]
</p>
</body>
</html>
Here is the DOM:
The a in 2a is in the content attribute of an element targeted by a ::before selector. That seems not to be a thing you can select. But maybe there's a mode that makes it so?
Books built with PreTeXt (like mine) are not yet using MathJax 3. (There are some things at risk of breaking that need to be addressed before PreTeXt can upgrade.) The reason things look so complicated when you inspect MathJax is that there is a lot of accessibility support built in. In particular, there are navigation tools that work with screen readers so that a blind reader can parse that content. MathJax also makes it possible to export your math to Nemeth Braille! :-)
If you right click on a MathJax element, you can choose to change the display mode to SVG, but that doesn't help with highlighting.
Huh. Live and learn. It's a tricky challenge, to be sure. I'm not sure what the right answer is; maybe a LaTeX mode for selection?
OK -- some success.
Follow-up. I asked Davide Cervone of MathJax about this on their Google group. Here is his response:
OK, here's what's happening: apparently Hypothesis doesn't just mark the beginning and ending of the annotation, but tries to drill down into the containing HTML and mark individual text regions separately (probably to make it possible to have an annotation cross tag boundaries). For MathJax output, that means it wraps each symbol in a separate annotation tag. In CommonHTML output, the characters are actually set as 0 height and are contained in a surrounding tag that adds the proper height and depth (so that the bounding box of the character is tight rather than that of the line height as a whole). Because the annotation tag is inside the container that gives the character its height and depth, that means the annotation height is 0, and it doesn't show up (even though it is there).
The HTML-CSS output doesn't try to make the bounding boxes of the characters be correct, and so when Hypothesis inserts the annotations, they are not zero height, and so show up.
In trying to be smart about the annotations, Hypothesis is getting itself in trouble when dealing with MathJax output. If it were not to descend into tags that are completely within its annotation, for example, then it would be able to highlight CHTML output as well as HTML-CSS output.
Davide
Using HTML-CSS isn't an option: it's deprecated in MathJax v2, and gone in v3. For typical student use with things as they are now, I think we just tell them that they can't highlight math.
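Davide's suggestion above (not descending into tags that lie completely within the annotation) can be sketched roughly as follows. This is only an illustration on a simplified tree, not Hypothesis's actual highlighter: the HlNode type, the labels, and highlightTargets are hypothetical names invented for this example.

```typescript
// Simplified stand-in for a DOM subtree; NOT Hypothesis's real data model.
type HlNode =
  | { kind: "text"; id: number }
  | { kind: "element"; label: string; children: HlNode[] };

// Collect the ids of all text leaves under a node, in document order.
function leafIds(node: HlNode): number[] {
  if (node.kind === "text") return [node.id];
  let ids: number[] = [];
  for (const child of node.children) ids = ids.concat(leafIds(child));
  return ids;
}

// Return labels of the outermost nodes that are wholly covered by the
// selection. An element whose text leaves are all selected is wrapped as a
// unit, so the highlight never ends up inside its zero-height glyph boxes.
function highlightTargets(node: HlNode, selected: number[]): string[] {
  if (node.kind === "text") {
    return selected.indexOf(node.id) !== -1 ? ["text#" + node.id] : [];
  }
  const ids = leafIds(node);
  const whollySelected =
    ids.length > 0 && ids.every((id) => selected.indexOf(id) !== -1);
  if (whollySelected) return [node.label]; // wrap the element, do not descend
  let targets: string[] = [];
  for (const child of node.children) {
    targets = targets.concat(highlightTargets(child, selected));
  }
  return targets;
}

// "When \(a \ne 0\), there are ..." with the math rendered as three glyphs.
const paragraph: HlNode = {
  kind: "element", label: "p", children: [
    { kind: "text", id: 0 },
    { kind: "element", label: "math", children: [
      { kind: "element", label: "glyph-a", children: [{ kind: "text", id: 1 }] },
      { kind: "element", label: "glyph-ne", children: [{ kind: "text", id: 2 }] },
      { kind: "element", label: "glyph-0", children: [{ kind: "text", id: 3 }] },
    ] },
    { kind: "text", id: 4 },
  ],
};

// Selecting the whole expression plus the trailing text yields one wrapper
// around the math container instead of three per-glyph wrappers.
const wholeMath = highlightTargets(paragraph, [1, 2, 3, 4]); // ["math", "text#4"]
```

Whether wrapping the outer container is safe for MathJax's layout in practice is an open question; the sketch only shows the traversal rule.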
Much of this involves how we do text anchoring and selection. On the page discussed in this ticket, we can see that MathJax creates a <script> element next to each expression. Inside that element lives the clean (original) expression that can be used to reconstruct a math expression, in theory. But in addition to that expression, there are also styled Unicode characters nested in HTML (primarily <span> tags) that live in the DOM as well, rendered by MathJax.
The whole thing for something as simple as just an "x" looks like this:
<span class="MathJax" id="MathJax-Element-8-Frame" tabindex="0" data-mathml="<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD"><mi mathvariant="bold">x</mi></mrow></math>" role="presentation" style="position: relative;">
<nobr aria-hidden="true"><span class="math" id="MathJax-Span-72" style="width: 0.658em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.507em; height: 0px; font-size: 124%;"><span style="position: absolute; clip: rect(1.918em, 1000.51em, 2.674em, -999.997em); top: -2.518em; left: 0em;"><span class="mrow" id="MathJax-Span-73"><span class="texatom" id="MathJax-Span-74"><span class="mrow" id="MathJax-Span-75"><span class="mi" id="MathJax-Span-76" style="font-family: STIXGeneral; font-weight: bold;">x</span></span></span></span><span style="display: inline-block; width: 0px; height: 2.523em;"></span></span></span><span style="display: inline-block; overflow: hidden; vertical-align: -0.059em; border-left: 0px solid; width: 0px; height: 0.691em;"></span></span></nobr>
<span class="MJX_Assistive_MathML" role="presentation">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mrow class="MJX-TeXAtom-ORD">
<mi mathvariant="bold">x</mi>
</mrow>
</math>
</span>
</span>
<script type="math/tex" id="MathJax-Element-8">\mathbf{x}</script>
Looking at the resulting textContent of a parent node, it ends up looking like this:
"xx\mathbf{x}"
This does not make much sense, because the hidden expression (which we do want) is merged with the visually styled rendered result, which is useless without the encapsulating inline styling. To be clear, we're not going to lift and shift HTML into the sidebar.
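To see where the merged string comes from, here is a minimal model of how textContent concatenates every descendant text node in document order, including the text inside a script element. The TcNode type and textContentOf are stand-ins written for this illustration (in a browser this is simply Node.textContent), and the whitespace-only text nodes of the real markup are left out.

```typescript
// Minimal stand-in for DOM nodes; in a browser this is Node.textContent.
type TcNode =
  | { kind: "text"; data: string }
  | { kind: "element"; tag: string; children: TcNode[] };

// Concatenate all descendant text nodes in document order.
function textContentOf(node: TcNode): string {
  if (node.kind === "text") return node.data;
  let out = "";
  for (const child of node.children) out += textContentOf(child);
  return out;
}

// Reduced version of the markup above: rendered glyph, assistive MathML,
// then the script element holding the original TeX.
const renderedX: TcNode = {
  kind: "element", tag: "p", children: [
    { kind: "element", tag: "span.MathJax", children: [
      { kind: "element", tag: "nobr", children: [{ kind: "text", data: "x" }] },
      { kind: "element", tag: "span.MJX_Assistive_MathML", children: [
        { kind: "text", data: "x" },
      ] },
    ] },
    { kind: "element", tag: "script", children: [
      { kind: "text", data: "\\mathbf{x}" },
    ] },
  ],
};

const merged = textContentOf(renderedX); // "xx\mathbf{x}"
```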
Now let's look at something a bit more complex...
Part of the textContent looks like this: "output vector, f:ℝn↦ℝmf:Rn↦Rm\mathbf{f}: \mathbb{R}^n \mapsto \mathbb{R}^m, the matrix of all first-order"
And the script tag's content (for the expressions) looks like this
\mathbf{f}: \mathbb{R}^n \mapsto \mathbb{R}^m
Where does the readable text start and end relative to the rendered expression's text? The only way I can see to do this reliably is to use the HTML structure, make assumptions about classes, and toss away all the text/Unicode characters inside anything with the .MathJax class. Then save the raw expression in the <script> tag as the "thing" we want to re-render into an expression in the sidebar alongside any captured plain text. But then we also have to save the correct offsets relative to the raw textContent so we can place the highlight in the correct place again.
So in this example...
"output vector, f:ℝn↦ℝmf:Rn↦Rm\mathbf{f}: \mathbb{R}^n \mapsto \mathbb{R}^m, the matrix of all first-order"
We need to throw away the rendered text (f:ℝn↦ℝmf:Rn↦Rm, shown in bold in the original comment) for the sidebar markup, but keep it around for re-anchoring so that the offset counts stay correct. There may be other ways to accomplish this and we should discuss further, but what I am suggesting here is perhaps a 4th type of Anchor/Selector at least, or possibly major modifications to other similar places in our annotator.
Also, this would only work for MathJax, which makes it specific and fragile. I'm ignoring any larger, more general use cases.
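A rough sketch of the idea described above: walk the tree, drop text inside anything with the MathJax class from the sidebar quote, keep the TeX from the script element, and still record every text node's offsets against the raw textContent for re-anchoring. Everything here (MjNode, Segment, extractQuote, the exact class and type checks) is a hypothetical illustration, not existing annotator code.

```typescript
// Simplified DOM stand-in; className and scriptType mirror the class and
// type attributes seen in the MathJax v2 markup above.
type MjNode =
  | { kind: "text"; data: string }
  | {
      kind: "element";
      tag: string;
      className?: string;
      scriptType?: string;
      children: MjNode[];
    };

type Role = "prose" | "rendered" | "tex";

// One entry per text node, with offsets into the raw textContent.
interface Segment { rawStart: number; rawEnd: number; role: Role; }
interface Extraction { quote: string; segments: Segment[]; }

function extractQuote(root: MjNode): Extraction {
  const result: Extraction = { quote: "", segments: [] };
  let rawOffset = 0;

  function visit(node: MjNode, role: Role): void {
    if (node.kind === "text") {
      const end = rawOffset + node.data.length;
      result.segments.push({ rawStart: rawOffset, rawEnd: end, role: role });
      // Rendered glyph text is skipped in the quote but still counted above.
      if (role !== "rendered") result.quote += node.data;
      rawOffset = end;
      return;
    }
    let childRole: Role = role;
    // Assumption: an exact class match is enough to spot MathJax output.
    if (node.className === "MathJax") childRole = "rendered";
    else if (node.tag === "script" && node.scriptType === "math/tex") {
      childRole = "tex";
    }
    for (const child of node.children) visit(child, childRole);
  }

  visit(root, "prose");
  return result;
}

const tex = "\\mathbf{f}: \\mathbb{R}^n \\mapsto \\mathbb{R}^m";
const sentence: MjNode = {
  kind: "element", tag: "p", children: [
    { kind: "text", data: "output vector, " },
    { kind: "element", tag: "span", className: "MathJax", children: [
      { kind: "text", data: "f:ℝn↦ℝm" }, // visual rendering
      { kind: "text", data: "f:Rn↦Rm" }, // assistive MathML text
    ] },
    { kind: "element", tag: "script", scriptType: "math/tex", children: [
      { kind: "text", data: tex },
    ] },
    { kind: "text", data: ", the matrix of all first-order" },
  ],
};

const extraction = extractQuote(sentence);
```

Here extraction.quote contains only the prose and the TeX source, while extraction.segments still covers the rendered text, so a highlight can be mapped back onto the raw textContent offsets.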
One more small fact worth mentioning here: type="math/tex" is ignored by the browser and assumed to be just a "data block", which simply means it does not get executed as JS (as far as I know). Our annotator also captures JS types such as type="text/javascript" and will highlight them just the same. This means that our sidebar annotation blockquotes will always contain the text inside a script tag if that tag is in the captured range. Script tags are rarely in between content, so this is seldom a problem, but when they are, folks would inadvertently capture and quote code that they would not otherwise see in the content.
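One possible way to handle the script-tag case, sketched under assumptions: decide from the type attribute whether a script's text should ever reach the quote. The MIME list below is deliberately partial, and looksExecutable and quoteScriptText are hypothetical helpers made up for this example; keeping only math/tex text is just one policy among several.

```typescript
// Partial list of JavaScript MIME types (an assumption for illustration;
// the HTML standard defines a longer list).
const JS_MIME_TYPES: string[] = [
  "text/javascript",
  "application/javascript",
  "text/ecmascript",
  "application/ecmascript",
  "application/x-javascript",
];

// Rough check for whether a browser would run the script. A missing or empty
// type means classic JavaScript; "module" is also executed. Anything else
// that is not a known JavaScript type is treated as non-executable here.
function looksExecutable(typeAttr: string | null): boolean {
  if (typeAttr === null) return true;
  // Compare only the MIME essence, ignoring parameters after ";".
  const essence = typeAttr.split(";")[0].trim().toLowerCase();
  if (essence === "" || essence === "module") return true;
  return JS_MIME_TYPES.indexOf(essence) !== -1;
}

// Hypothetical policy: never quote executable code, and among data blocks
// keep only MathJax TeX sources such as "math/tex" or
// "math/tex; mode=display".
function quoteScriptText(typeAttr: string | null): boolean {
  if (typeAttr === null || looksExecutable(typeAttr)) return false;
  return typeAttr.trim().toLowerCase().indexOf("math/tex") === 0;
}

const keepTex = quoteScriptText("math/tex"); // true
const keepJs = quoteScriptText("text/javascript"); // false
```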
So perhaps there are 2 classes of issues here:
Are there any plans to fix these problems?
A lot of code in potentially relevant areas has changed since the issue was filed, so it will need re-evaluating to figure out which problems still exist and make sure the steps to reproduce are still valid. Nobody has been planning to do that as far as I know.
I think this is separate from https://github.com/hypothesis/product-backlog/issues/414, as #414 refers to writing math expressions in our text editor, whereas this issue is regarding highlighting math expressions as part of an annotation.
Steps to reproduce
Expected behaviour
The highlighted text appears in a new annotation card and you are able to type your annotation in the text editor
Actual behaviour
It's inconsistent. Sometimes a new annotation card is created, but the sidebar doesn't pop open; sometimes nothing happens at all. Even when the annotation card is created, though, the sidebar is very slow to open and the math expressions are not rendered in an intelligible way in the quoted text.
Browser/system information
Reported by a user and replicated by me on Chrome / Firefox / Safari for Mac.
Additional information
I assume for this to work properly, we'd somehow need to convert what is rendered by mathjax.js back to the original input text? This seems like a feature request rather than a bug to me.