Open ksachs opened 6 years ago
The team investigated this bug, and it's not on the INSPIRE side; the abstract has unescaped tags:
<subfield code="a">Applying advances in exact computations of supersymmetric gauge theories, we study the structure of correlation functions in two-dimensional <math altimg="si1.gif" display="inline" overflow="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</mn><mo>,</mo><mn>2</mn><mo stretchy="false">)</mo></math> Abelian and non-Abelian gauge theories. We determine universal relations among correlation functions, which yield differential equations governing the dependence of the gauge theory ground state on the Fayet–Iliopoulos parameters of the gauge theory. For gauge theories with a non-trivial infrared <math altimg="si1.gif" display="inline" overflow="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</mn><mo>,</mo><mn>2</mn><mo stretchy="false">)</mo></math> superconformal fixed point, these differential equations become the Picard–Fuchs operators governing the moduli-dependent vacuum ground state in a Hilbert space interpretation. For gauge theories with geometric target spaces, a quadratic expression in the Givental I -function generates the analyzed correlators. This gives a geometric interpretation for the correlators, their relations, and the differential equations. For classes of Calabi–Yau target spaces, such as threefolds with up to two Kähler moduli and fourfolds with a single Kähler modulus, we give general and universally applicable expressions for Picard–Fuchs operators in terms of correlators. We illustrate our results with representative examples of two-dimensional <math altimg="si1.gif" display="inline" overflow="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</mn><mo>,</mo><mn>2</mn><mo stretchy="false">)</mo></math> gauge theories.</subfield>
Once you fix them on your side, it will appear correctly in the holdingpen.
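A minimal sketch of what "fixing" the unescaped tags could look like, assuming the goal is to keep the MathML as literal text inside the subfield. It uses only the standard-library `xml.etree.ElementTree`; the helper name `flatten_inner_markup` and the shortened sample subfield are made up for illustration and are not part of Invenio or INSPIRE.

```python
import xml.etree.ElementTree as ET


def flatten_inner_markup(subfield):
    """Turn child elements of `subfield` into literal text (hypothetical helper)."""
    parts = [subfield.text or ""]
    for child in list(subfield):
        # tostring() serializes the child together with its tail text
        parts.append(ET.tostring(child, encoding="unicode"))
        subfield.remove(child)
    subfield.text = "".join(parts)
    return subfield


# Shortened stand-in for the abstract quoted above
raw = ('<subfield code="a">theories in <math><mi>N</mi><mo>=</mo>'
       '<mn>2</mn></math> gauge</subfield>')
elem = flatten_inner_markup(ET.fromstring(raw))

# On output the serializer escapes the markup as &lt;math&gt;... so the
# subfield holds plain text only
escaped = ET.tostring(elem, encoding="unicode")
print(escaped)
```

Parsing `escaped` again gives back a subfield with no child elements whose text still contains the full `<math>` markup.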
legacy is dealing with this.
We parse the xml through create_records. So our normal upload is already clean.
However, even batchupload with mathml directly works fine. Example on test (inspirevm16.cern.ch):
Task #1196844 Input file '/opt/cds-invenio/var/tmp-shared/batchupload_sachs_20180702101904_bCqr9U', input mode 'insert'. (I don't have permission to see that file, but I assume the mathml is still in.) Record 1673841 has no mathml
We thought the DESY spider would accept essentially the same xml structure that could be harvested on legacy.
It seems to be truncated already after create_record:
In [1]: from dojson.contrib.marc21.utils import create_record
In [2]: create_record('''<datafield tag="245"><subfield code="a">Applying advances in exact computations of supersymmetric gauge theories, we study the structure of correlati
...: on functions in two-dimensional <math altimg="si1.gif" display="inline" overflow="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</m
...: n><mo>,</mo><mn>2</mn><mo stretchy="false">)</mo></math> Abelian and non-Abelian gauge theories. We determine universal relations among correlation functions, which y
...: ield differential equations governing the dependence of the gauge theory ground state on the Fayet–Iliopoulos parameters of the gauge theory. For gauge theories with
...: a non-trivial infrared <math altimg="si1.gif" display="inline" overflow="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</mn><mo>,</
...: mo><mn>2</mn><mo stretchy="false">)</mo></math> superconformal fixed point, these differential equations become the Picard–Fuchs operators governing the moduli-depend
...: ent vacuum ground state in a Hilbert space interpretation. For gauge theories with geometric target spaces, a quadratic expression in the Givental I -function generat
...: es the analyzed correlators. This gives a geometric interpretation for the correlators, their relations, and the differential equations. For classes of Calabi–Yau tar
...: get spaces, such as threefolds with up to two Kähler moduli and fourfolds with a single Kähler modulus, we give general and universally applicable expressions for Pic
...: ard–Fuchs operators in terms of correlators. We illustrate our results with representative examples of two-dimensional <math altimg="si1.gif" display="inline" overflo
...: w="scroll"><mi mathvariant="script">N</mi><mo>=</mo><mo stretchy="false">(</mo><mn>2</mn><mo>,</mo><mn>2</mn><mo stretchy="false">)</mo></math> gauge theories.</subfi
...: eld></datafield>''')
Out[2]:
GroupableOrderedDict([('__order__', ('245!!',)),
('245!!',
GroupableOrderedDict([('__order__', ('a',)),
('a',
'Applying advances in exact computations of supersymmetric gauge theories, we study the structure of correlation functions in two-dimensional ')]))])
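The cut-off right before the first `<math>` matches how ElementTree-style parsers expose mixed content: an element's `.text` holds only the text up to its first child element. Whether `create_record` reads the subfield this way is an assumption, not something confirmed in this thread. The sketch below uses the standard-library `xml.etree.ElementTree` as a stand-in for the parser, with a shortened sample title.

```python
import xml.etree.ElementTree as ET

raw = ('<datafield tag="245"><subfield code="a">functions in two-dimensional '
       '<math><mi>N</mi><mo>=</mo><mn>2</mn></math> '
       'Abelian gauge theories.</subfield></datafield>')

subfield = ET.fromstring(raw).find("subfield")

# .text stops at the first child element, mirroring the truncated output
only_leading = subfield.text

# itertext() walks the element, its children and their tails, so nothing
# is lost (the MathML tags themselves are dropped, though)
all_text = "".join(subfield.itertext())

print(repr(only_leading))
print(repr(all_text))
```

If this is what happens, any code that wants the full value has to look at the child elements as well, or the markup has to be escaped before parsing.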
We will run xml files for upload to labs via legacy create_records. So this is not blocking but should be solved. MathML will come from the publishers and labs will have to deal with it.
@ksachs not sure I understand what you mean about "we will run xml-files for upload to labs via legacy create_records". Wouldn't that prevent you from using labs completely for publisher harvests?
We parse the xml via a stand-alone python program which uses our local installation of inspire-legacy, i.e. misusing invenio as an xml parser. The modified xml (after deleting online-first articles etc.) is written to file and put on the ftp server to be harvested by labs.
from invenio.bibrecord import *
....
xmlrecords = xmlfile.read()
recs = create_records(xmlrecords,verbose=1)
xmlfile.close()
newxmlfile = codecs.EncodedFile(codecs.open(....,mode='wb'),'utf8')
newxmlfile.write('<?xml version="1.0" encoding="UTF-8"?>\n<collection>\n')
for recordtuple in recs:
    ...
    # modify record
    newxmlfile.write(record_xml_output(record))
newxmlfile.write('</collection>\n')
newxmlfile.close()
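Before putting such a file on the ftp server, one could check whether any subfield still contains inline markup. This is only a sketch under assumptions: it uses the standard-library parser rather than `invenio.bibrecord`, assumes MARCXML without a namespace, and the function name `subfields_with_markup` and the sample records are hypothetical.

```python
import xml.etree.ElementTree as ET


def subfields_with_markup(marcxml):
    """Return (tag, code) pairs for subfields that contain child elements."""
    root = ET.fromstring(marcxml)
    hits = []
    for datafield in root.iter("datafield"):
        for subfield in datafield.findall("subfield"):
            if len(subfield):  # at least one child element, e.g. <math>
                hits.append((datafield.get("tag"), subfield.get("code")))
    return hits


sample = """<collection>
<record>
<datafield tag="245" ind1=" " ind2=" "><subfield code="a">Plain title</subfield></datafield>
<datafield tag="520" ind1=" " ind2=" "><subfield code="a">Study of <math><mi>N</mi></math> theories</subfield></datafield>
</record>
</collection>"""

flagged = subfields_with_markup(sample)
print(flagged)
```

Records that show up in `flagged` are the ones likely to be cut off at the first tag.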
In abstract (and title?), information with <math> gets truncated. Example: https://labs.inspirehep.net/holdingpen/1086582 which contains abstract