openlibhums / janeway

A web-based platform for publishing journals, preprints, conference proceedings, and books
https://janeway.systems/
GNU Affero General Public License v3.0

Dollar signs in para text improperly triggering MathJax #2658

Closed pgoussy closed 2 years ago

pgoussy commented 2 years ago

Describe the bug: If a single paragraph contains more than one dollar sign character ($), all of the text between those characters is treated like a string of LaTeX content and rendered with MathJax.

To Reproduce (steps to reproduce the behavior):

  1. Go to https://journals.publishing.umich.edu/jala/plugins/typesetting/preview_galley/article/1914/galley/440/
  2. Click on Lincoln and the Cantralls: “Too Good a Whig” in the floating TOC
  3. Scroll down to the paragraph beginning "Lincoln's professional relationship..."
  4. Note that a significant chunk of text is rendering in MathJax (and because it's not meant to be LaTeX content, it's rendering very poorly, with no spaces between characters and inconsistent italics). It also extends past the text margin because (I assume) MathJax doesn't know where to insert a line break.

Here's the original JATS tagging: <p>Lincoln&#x2019;s professional relationship with the extended Cantrall family is further revealed when Lincoln, as a money lender, extended a loan to Thomas Cantrall, Levi&#x2019;s oldest son, for $600 on November 28, 1851 for two years with interest at ten per cent, secured with 80 acres of farm land, near the Cantrall survey. On November 29, 1852, Thomas and his wife, Elizabeth, gave Lincoln a new note for $660, releasing the first note. The new note, written and signed by Lincoln, contained on the reverse side three interest payments between 1854 and 1855. Then a tragic accident took Thomas&#x2019;s life when on June 22, 1856, while working at a sawmill, his team of horses became frightened at the blowing of the whistle, ran off, and dragged a log over him. A younger brother, Charles S. Cantrall, administered the estate, paid off the note where Lincoln recorded on it: &#x201C;Received, June 22, 1858 of Charles S. Cantrall, adm. of Thomas Cantrall, eight hundred and twenty-four dollars and twenty-four cents, in full balance of principal and interest on the note.&#x201D;<sup><xref ref-type="fn" rid="fn021">21</xref></sup></p>

I did attempt to replace the "$" characters with the encoded hex reference (&#x0024;) in the JATS, but the final rendering in Janeway was unchanged. I suspect that the hex is converted to a plain "$" during the JATS>HTML transformation, and then MathJax picks it up the same as it would have picked up an unencoded "$".
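The suspicion above can be checked outside Janeway. Here is a minimal sketch using Python's standard `xml.etree.ElementTree` as a stand-in for the JATS>HTML transformation (an assumption; Janeway's actual pipeline is XSLT-based and is not reproduced here). Any conforming XML parser resolves `&#x0024;` to a plain "$" before later stages see the text, while the fullwidth dollar sign remains a distinct character.

```python
import xml.etree.ElementTree as ET

# The same JATS-like fragment written with a literal "$" and with the hex reference.
literal = ET.fromstring("<p>a loan for $600</p>")
encoded = ET.fromstring("<p>a loan for &#x0024;600</p>")

# The parser resolves the character reference, so both carry identical text.
same_text = literal.text == encoded.text

# The fullwidth dollar sign (U+FF04) is a different character, so a scanner
# looking for an ASCII "$" would not see one here.
fullwidth = ET.fromstring("<p>a loan for &#xFF04;600</p>")
has_ascii_dollar = "$" in fullwidth.text

print(same_text, has_ascii_dollar)
```

This is consistent with the encoded reference making no difference to the rendering, and with the fullwidth character working as a stopgap.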

As a workaround (in the interest of getting this article published ASAP) I will probably replace the default dollar sign characters with a "fullwidth dollar sign" (&#xFF04;), so the above example may no longer be relevant. However, I have attached a screenshot below.

Expected behavior: Simply, a dollar sign in JATS should be a dollar sign in Janeway. Or, to put it another way, MathJax shouldn't be activated unless there's explicit math tagging in the JATS.

Screenshots: 2021-12-09_14-55-15 (attached image)

pgoussy commented 2 years ago

Through trial and error, I've just realized that a way to (seemingly) avoid this error is to have JATS tagging between the two dollar signs that share a paragraph. I noticed that footnote 20 featured two dollar signs with a few italicized words between them. As an experiment, I tried putting italics tags around a single space character (which has no visible change for the reader) between two dollar signs in another paragraph--and lo and behold, the MathJax stopped rendering in that paragraph. So, that will be my new workaround instead of replacing the dollar sign character: italicizing a single space somewhere between the two dollar signs to "interrupt" the MathJax.
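One plausible explanation for why the italics trick works, sketched in Python: if the delimiter scan only pairs dollar signs that sit in the same uninterrupted run of text, then any element between them splits the run and no pair is found. This is a simplified model, not MathJax's actual code; the `math_spans` helper and the sample paragraphs are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Lazy match of everything between two dollar signs within one text run.
INLINE_DOLLARS = re.compile(r"\$(.+?)\$", re.DOTALL)

def math_spans(xml_fragment):
    """Return the spans a per-text-run '$...$' scanner would treat as math."""
    root = ET.fromstring(xml_fragment)
    spans = []
    for element in root.iter():
        # .text and .tail are each one uninterrupted run of text; a pair is
        # only found when both dollar signs fall inside the same run.
        for run in (element.text, element.tail):
            if run:
                spans.extend(INLINE_DOLLARS.findall(run))
    return spans

plain = "<p>a loan for $600 and a new note for $660.</p>"
workaround = "<p>a loan for $600<italic> </italic>and a new note for $660.</p>"

print(math_spans(plain))
print(math_spans(workaround))
```

Under this model the first paragraph yields one false "math" span and the second yields none, matching the behavior observed with footnote 20.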

ajrbyers commented 2 years ago

I think all we need to do is change the delimiter for MathJax and then update the JATS to output the new delimiter. Some suggest \$ \$ is a good delimiter as it would rarely occur in natural text.
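To illustrate the effect of a delimiter change, here is a small Python sketch that compares a bare `$...$` pair against `\(...\)`, which MathJax documents as an inline TeX delimiter. The choice of `\(...\)` is an assumption for the example only; the thread does not settle on a delimiter, and `find_inline_math` is a hypothetical helper, not part of Janeway or MathJax.

```python
import re

def find_inline_math(text, open_delim, close_delim):
    """Return the text spans enclosed by the given delimiter pair."""
    pattern = re.compile(
        re.escape(open_delim) + r"(.+?)" + re.escape(close_delim),
        re.DOTALL,
    )
    return pattern.findall(text)

prose = "a loan for $600 in 1851, then a new note for $660 in 1852"
formula = r"the identity \(a^2 + b^2 = c^2\) holds"

# Bare dollar signs: ordinary prose is mistaken for math.
dollar_hits = find_inline_math(prose, "$", "$")

# Backslash-parenthesis delimiters: prose is left alone, real math still matches.
paren_hits_prose = find_inline_math(prose, r"\(", r"\)")
paren_hits_formula = find_inline_math(formula, r"\(", r"\)")

print(dollar_hits, paren_hits_prose, paren_hits_formula)
```

Any delimiter that rarely occurs in running text would behave the same way in this sketch; the transform would then need to emit that delimiter around tagged formulas.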


pgoussy commented 2 years ago

That makes sense to me conceptually! Currently, when actual LaTeX content is present, the JATS is tagged with <mml:math> and wrapped in either <inline-formula> or <disp-formula> as appropriate. In other words, I don't actually see any dollar signs in the JATS--so I assume that somehow the XSLT is transforming these tags into the delimiters that are picked up by MathJax?

Either way, let me know once you've made some sort of fix, and I can test if it's worked by reverting my example article back to the version without the italics workaround.

pgoussy commented 2 years ago

Bumping this up because we've encountered it in another journal. Is this on the docket for the next RC (or perhaps a test case for the editable XSLT)?

joemull commented 2 years ago

@pgoussy We added this to the list for 1.4.2 at least

pgoussy commented 2 years ago

Screenshot: 2022-02-18_12-46-18 (attached image)