To generate NTCIR-compatible (X)HTML output, we need to be able to produce the NTCIR files from the TeX sources. As a preliminary task, I tried to reproduce the conversion of https://arxiv.org/format/1305.2869v1
Two steps are required for this.
cat out/S1.p1
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Untitled Document</title>
<!--Generated on Wed Nov 30 10:24:24 2016 by LaTeXML (version 0.8.1) http://dlmf.nist.gov/LaTeXML/.-->
<!--Document created on May 2013.-->
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8"/>
<link rel="stylesheet" href="LaTeXML.css" type="text/css"/>
<link rel="stylesheet" href="ltx-article.css" type="text/css"/>
<link rel="up" href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="up up" href="out" title="DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="start" href="out" title="DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="prev" href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="next" href="S2" title="2 The model and its static Skyrmion ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="sidebar" href="S1.p2" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="sidebar" href="S1.p3" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="sidebar" href="S1.p4" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="section" href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="section" href="S2" title="2 The model and its static Skyrmion ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="section" href="S3" title="3 Skyrmion dynamics ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="section" href="S4" title="4 Conclusion ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="section" href="Sx1" title="Acknowledgements ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
<link rel="bibliography" href="bib" title="References ‣ DCPT-13/17 The dynamics of domain wall Skyrmions"/>
</head>
<body>
<div class="ltx_page_main">
<div class="ltx_page_header">
<div><a href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="up"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a><a href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="prev"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a><a href="S2" title="2 The model and its static Skyrmion ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="next"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>The model and its static Skyrmion</span></a>
</div></div>
<div class="ltx_page_content">
<div class="ltx_para ltx_authors_1line">
<p class="ltx_p">Skyrmions <cite class="ltx_cite ltx_citemacro_cite">[<a href="bib#bib1" title="" class="ltx_ref">1</a>]</cite> are topological solitons in
generalized sigma models that include a term in the Lagrangian that is
quartic in the derivatives of the field.
The role of this quartic Skyrme term is to provide a fixed finite size for
the Skyrmion, as revealed by Derrick’s theorem <cite class="ltx_cite ltx_citemacro_cite">[<a href="bib#bib2" title="" class="ltx_ref">2</a>]</cite>.
The original Skyrme model is a relativistic theory in (3+1)-dimensions,
where Skyrmions describe baryons within an effective field theory.
There is also a (2+1)-dimensional analogue of this theory,
known as the baby Skyrme model <cite class="ltx_cite ltx_citemacro_cite">[<a href="bib#bib3" title="" class="ltx_ref">3</a>]</cite>. This is a generalization of the
O(3) sigma model, and has proved to be a useful
testing ground for the study of several aspects of Skyrmions.</p>
</div>
</div>
<div class="ltx_page_footer">
<div><a href="S1" title="1 Introduction ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="prev"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a><a href="bib" title="References ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="bibliography"><span class="ltx_text ltx_ref_title">References</span></a><a href="S2" title="2 The model and its static Skyrmion ‣ DCPT-13/17 The dynamics of domain wall Skyrmions" class="ltx_ref" rel="next"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>The model and its static Skyrmion</span></a>
</div>
<div class="ltx_page_logo">Generated on Wed Nov 30 10:24:24 2016 by <a href="http://dlmf.nist.gov/LaTeXML/">LaTeXML <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAsAAAAOCAYAAAD5YeaVAAAAAXNSR0IArs4c6QAAAAZiS0dEAP8A/wD/oL2nkwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wKExQZLWTEaOUAAAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAdpJREFUKM9tkL+L2nAARz9fPZNCKFapUn8kyI0e4iRHSR1Kb8ng0lJw6FYHFwv2LwhOpcWxTjeUunYqOmqd6hEoRDhtDWdA8ApRYsSUCDHNt5ul13vz4w0vWCgUnnEc975arX6ORqN3VqtVZbfbTQC4uEHANM3jSqXymFI6yWazP2KxWAXAL9zCUa1Wy2tXVxheKA9YNoR8Pt+aTqe4FVVVvz05O6MBhqUIBGk8Hn8HAOVy+T+XLJfLS4ZhTiRJgqIoVBRFIoric47jPnmeB1mW/9rr9ZpSSn3Lsmir1fJZlqWlUonKsvwWwD8ymc/nXwVBeLjf7xEKhdBut9Hr9WgmkyGEkJwsy5eHG5vN5g0AKIoCAEgkEkin0wQAfN9/cXPdheu6P33fBwB4ngcAcByHJpPJl+fn54mD3Gg0NrquXxeLRQAAwzAYj8cwTZPwPH9/sVg8PXweDAauqqr2cDjEer1GJBLBZDJBs9mE4zjwfZ85lAGg2+06hmGgXq+j3+/DsixYlgVN03a9Xu8jgCNCyIegIAgx13Vfd7vdu+FweG8YRkjXdWy329+dTgeSJD3ieZ7RNO0VAXAPwDEAO5VKndi2fWrb9jWl9Esul6PZbDY9Go1OZ7PZ9z/lyuD3OozU2wAAAABJRU5ErkJggg==" alt="[LOGO]"/></a></div></div>
</div>
</body>
</html>
vs.
cat refs/1305.2869_1_1.xhtml
<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" /></head><body>
<div class="ltx_para" id="S1.p1">
<p class="ltx_p" id="S1.p1.1">Skyrmions <cite class="ltx_cite">[<a href="#bib.bib1" class="ltx_ref" title="">1</a>]</cite> are topological solitons in
generalized sigma models that include a term in the Lagrangian that is
quartic in the derivatives of the field.
The role of this quartic Skyrme term is to provide a fixed finite size for
the Skyrmion, as revealed by Derrick’s theorem <cite class="ltx_cite">[<a href="#bib.bib2" class="ltx_ref" title="">2</a>]</cite>.
The original Skyrme model is a relativistic theory in (3+1)-dimensions,
where Skyrmions describe baryons within an effective field theory.
There is also a (2+1)-dimensional analogue of this theory,
known as the baby Skyrme model <cite class="ltx_cite">[<a href="#bib.bib3" class="ltx_ref" title="">3</a>]</cite>. This is a generalization of the
O(3) sigma model, and has proved to be a useful
testing ground for the study of several aspects of Skyrmions.</p>
</div>
</body></html>
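Comparing the two listings, the reference file contains only the paragraph `div` inside a minimal `html`/`head`/`body` wrapper, while the LaTeXML page also carries navigation links, a header and a footer. A minimal sketch of stripping that wrapper with Python's standard library follows. The function name `minimal_page` is hypothetical, and this is not the tooling that produced the NTCIR files; it only mirrors the wrapper seen in `refs/1305.2869_1_1.xhtml`.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements without an "ns0:" prefix.
ET.register_namespace("", NS)


def minimal_page(latexml_page):
    """Keep only the first ltx_para div and wrap it in a bare XHTML page.

    Hypothetical helper: the wrapper layout is copied from the reference
    listing, not from any NTCIR specification.
    """
    src = ET.fromstring(latexml_page.encode("utf-8"))
    # Find the first <div> whose class list contains "ltx_para".
    para = next(
        d for d in src.iter(f"{{{NS}}}div")
        if "ltx_para" in d.get("class", "").split()
    )
    html = ET.Element(f"{{{NS}}}html")
    head = ET.SubElement(html, f"{{{NS}}}head")
    ET.SubElement(head, f"{{{NS}}}meta", {
        "http-equiv": "Content-Type",
        "content": "application/xhtml+xml; charset=UTF-8",
    })
    body = ET.SubElement(html, f"{{{NS}}}body")
    body.append(para)
    return ET.tostring(html, encoding="unicode")
```

This only removes the page chrome. It does not add the `id` attributes or rewrite the citation links and class names, so the result would still not match the reference byte for byte.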
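However, there are still some differences between the converted files:
file names
out/S1.p1 vs. refs/1305.2869_1_1.xhtml
file contents
the full LaTeXML page vs. the bare paragraph, as in the two listings above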
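Within the paragraph itself, the two listings above differ in a few attributes: the reference has `id` values such as `S1.p1` and `S1.p1.1`, plain `ltx_para` and `ltx_cite` classes, and links of the form `#bib.bib1`, whereas the generated file has no `id`s, extra classes (`ltx_authors_1line`, `ltx_citemacro_cite`), and links of the form `bib#bib1`. A rough sketch for listing such differences automatically is shown here. The helper names `find_para`, `signature` and `compare` are made up for illustration, and the comparison is positional, so it assumes both paragraphs contain the same sequence of elements.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"


def find_para(xhtml_text):
    """Return the first <div> whose class list contains 'ltx_para'."""
    root = ET.fromstring(xhtml_text.encode("utf-8"))
    for div in root.iter(XHTML + "div"):
        if "ltx_para" in div.get("class", "").split():
            return div
    return None


def signature(elem):
    """Flatten an element tree into (tag, class, id, href) tuples."""
    return [
        (e.tag.replace(XHTML, ""), e.get("class"), e.get("id"), e.get("href"))
        for e in elem.iter()
    ]


def compare(generated, reference):
    """Pair up elements by position and keep the pairs that differ."""
    gen = signature(find_para(generated))
    ref = signature(find_para(reference))
    return [(g, r) for g, r in zip(gen, ref) if g != r]
```

Calling `compare(open("out/S1.p1", encoding="utf-8").read(), open("refs/1305.2869_1_1.xhtml", encoding="utf-8").read())` would then return one pair of tuples per element whose tag, class, id or href differs.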