CrossRef / rest-api-doc

Documentation for Crossref's REST API. For questions or suggestions, see https://community.crossref.org/

MATHML not represented in a useful way in JSON #325

Open ckoscher opened 6 years ago

ckoscher commented 6 years ago

=== From GO-312

I'm not clear on how MathML should be handled in JSON output (if at all), but APS has noted that in the JSON output, MathML elements just appear as separate elements. This is causing display issues downstream.

For example:

http://api.crossref.org/works/10.1103/PhysRevLett.112.102502

title appears as:

β Decay of Ca 38 : Sensitive test of Isospin Symmetry-Breaking Corrections from Mirror Superallowed 0 + → 0 + Transitions

deposited as:

<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline"> <mml:mi>β</mml:mi> </mml:math> Decay of <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline"> <mml:mrow> <mml:mmultiscripts> <mml:mrow> <mml:mi>Ca</mml:mi> </mml:mrow> <mml:mprescripts/> <mml:none/> <mml:mrow> <mml:mn>38</mml:mn> </mml:mrow> </mml:mmultiscripts> </mml:mrow> </mml:math> : Sensitive test of Isospin Symmetry-Breaking Corrections from Mirror Superallowed <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline"> <mml:msup> <mml:mn>0</mml:mn> <mml:mo>+</mml:mo> </mml:msup> <mml:mo stretchy="false">→</mml:mo> <mml:msup> <mml:mn>0</mml:mn> <mml:mo>+</mml:mo> </mml:msup> </mml:math> Transitions
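
The flattening reported above can be reproduced with a short sketch. It is only an illustration, assuming a naive text extraction that drops every tag and collapses whitespace; it is not taken from Crossref's code, and `flatten` is a hypothetical helper name. The input is an excerpt of the deposited title.

```python
import xml.etree.ElementTree as ET

# Excerpt of the as-deposited title from this issue (first two MathML runs only).
deposited = (
    '<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline">'
    ' <mml:mi>β</mml:mi> </mml:math> Decay of '
    '<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML" display="inline">'
    ' <mml:mrow> <mml:mmultiscripts> <mml:mrow> <mml:mi>Ca</mml:mi> </mml:mrow>'
    ' <mml:mprescripts/> <mml:none/> <mml:mrow> <mml:mn>38</mml:mn> </mml:mrow>'
    ' </mml:mmultiscripts> </mml:mrow> </mml:math>'
)


def flatten(fragment: str) -> str:
    """Hypothetical helper: keep only text nodes and collapse whitespace."""
    # Wrap the fragment so it has a single root element and can be parsed.
    root = ET.fromstring(f"<title>{fragment}</title>")
    # itertext() yields text and tail strings in document order, tags dropped.
    return " ".join("".join(root.itertext()).split())


flat_title = flatten(deposited)
print(flat_title)
```

Because the prescript structure of `mmultiscripts` is discarded, the isotope number ends up as a separate token after the element symbol, which matches the way the title appears in the JSON output.
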
ckoscher commented 6 years ago

As per the sprint planning meeting, we will provide both a plain title and an inline XML title version.

ckoscher commented 6 years ago

Jennifer Lin [Administrator] added a comment - 11/Oct/17 10:17 PM

Note that we've had a user return asking for an update on this (>1 year old) issue.

jenniferlin15 commented 6 years ago

We will create an entry that preserves the XML for all records across all content types, regardless of whether the title contains markup. The content for this entry will duplicate the as-deposited XML title into a JSON field called 'title-xml'. This field will be part of the JSON document but not an indexed field in Solr. The field will contain XML text as follows:

a) content must have the following opening tag:

<title xmlns="http://www.crossref.org/xschema/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:mml="http://www.w3.org/1998/Math/MathML" xsi:schemaLocation="http://www.crossref.org/xschema/1.1 http://doi.crossref.org/schemas/unixref1.1.xsd">

b) content will have the closing tag: </title>

c) contents between opening and closing tag will be the contents of the tag from the UNIXSD XML. This covers all content types.

Output example:

"title-xml": ["<title xmlns=\"http://www.crossref.org/xschema/1.1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" xsi:schemaLocation=\"http://www.crossref.org/xschema/1.1 http://doi.crossref.org/schemas/unixref1.1.xsd\">NeisseriaBase: a specialised\n <i>Neisseria</i>\n genomic resource and analysis platform</title>"]

Multi-title example:

Input

<titles>
  <title> NeisseriaBase: a specialised <i>Neisseria</i> genomic resource and analysis platform </title>
  <title>My paper</title>
</titles>

Output

"title": ["NeisseriaBase: a specialised Neisseria genomic resource and analysis platform", "My paper"],
"title-xml": [
  "<title xmlns=\"http://www.crossref.org/xschema/1.1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" xsi:schemaLocation=\"http://www.crossref.org/xschema/1.1 http://doi.crossref.org/schemas/unixref1.1.xsd\"> NeisseriaBase: a specialised\n<i>Neisseria</i>\n genomic resource and analysis platform</title>",
  "<title xmlns=\"http://www.crossref.org/xschema/1.1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:mml=\"http://www.w3.org/1998/Math/MathML\" xsi:schemaLocation=\"http://www.crossref.org/xschema/1.1 http://doi.crossref.org/schemas/unixref1.1.xsd\"> My paper </title>"
]

afandian commented 6 years ago

Looks good.

jenniferlin15 commented 6 years ago

Update: @ckoscher had a convo with the member and a new solution is now on the horizon: We will get the XSLT to format the MathML tech string. This will be run as part of the deposit processing so all downstream outputs will be sorted. Result will retain markdown so that systems can properly represent the characters.

ppolischuk commented 6 years ago

CS-3845
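
For reference, the 'title-xml' construction proposed earlier in this thread (fixed opening tag, as-deposited content, closing `</title>`) could be sketched roughly as below. This is a hedged illustration only, not Crossref's implementation: `build_title_fields` is a hypothetical name, the regex-based extraction is an assumption chosen to keep the deposited content byte-for-byte, and the plain-title step is a naive tag strip.

```python
import json
import re

# Opening and closing tags as quoted in the proposal in this thread.
OPEN_TAG = (
    '<title xmlns="http://www.crossref.org/xschema/1.1" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:mml="http://www.w3.org/1998/Math/MathML" '
    'xsi:schemaLocation="http://www.crossref.org/xschema/1.1 '
    'http://doi.crossref.org/schemas/unixref1.1.xsd">'
)
CLOSE_TAG = "</title>"


def build_title_fields(titles_xml: str) -> dict:
    """Hypothetical helper: derive 'title' and 'title-xml' lists from a <titles> block."""
    # Capture the inner content of each <title> element verbatim.
    # The \b keeps the pattern from matching the enclosing <titles> tag.
    inner = re.findall(r"<title\b[^>]*>(.*?)</title>", titles_xml, flags=re.DOTALL)
    # Plain version: drop every tag, then collapse runs of whitespace.
    plain = [" ".join(re.sub(r"<[^>]+>", " ", t).split()) for t in inner]
    # XML version: wrap the untouched content in the fixed tags.
    wrapped = [OPEN_TAG + t + CLOSE_TAG for t in inner]
    return {"title": plain, "title-xml": wrapped}


sample = """<titles>
  <title>NeisseriaBase: a specialised <i>Neisseria</i> genomic resource and analysis platform</title>
  <title>My paper</title>
</titles>"""

fields = build_title_fields(sample)
# json.dumps escapes the inner double quotes, as in the multi-title output example.
print(json.dumps(fields, indent=2, ensure_ascii=False))
```

Serialising through `json.dumps` rather than concatenating strings by hand keeps the attribute quotes inside each 'title-xml' entry properly escaped.
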