Vitaliy-1 / JATSParser

JATSParser is intended to be integrated with Open Journal Systems 3.0+ to transform JATS XML into various formats
GNU General Public License v3.0

Appendix not extracted from JATS XML #12

Open rinorazzi opened 4 years ago

rinorazzi commented 4 years ago

The parser does not extract appendices from the XML. Appendices are tagged with the <app> tag within the <back> element. Here is an example of the XML for an appendix:

.......... APPENDIX

With respect to our sample, the time spent to clear the 3 consistencies tested ranges from 4 up to 44 seconds. Stratifying linearly this time on 9 levels of 5 seconds, a factor of correction (FOC) is obtained to adjust the p-score in the following way: 0-5 secs = +1, 6-10 secs = +2, 11-15 secs = +3, 16-20 secs = +4, 21-25 secs = +5, 26-30 secs = +6, 31-35 secs = +7, 36-40 secs = +8, > 40 secs = +9 (Table V). The sum of the p-score total + FOC represents the timed p-score (tp-score). In this new role, the tp-score ranges from 5 up to 20, expressing itself as a continuum of severity 26. The possibility of a clinical subdivision of the tp-score in further levels is under consideration.

Vitaliy-1 commented 4 years ago

Hi @rinorazzi,

How do you usually tag appendices? Is it like the example in the JATS XML standard:

<article dtd-version="1.2">
<front>...</front>
<body>...</body>
<back>
<app-group>
<app>
<title>Appendix</title>
<sec>
<title>The Bipolar Seesaw.</title>
<p>The evidence for the antiphasing of the millennial-duration climate
changes occurring on the Antarctic continent ...</p>
...
</sec>
</app>
</app-group>
<ref-list>...</ref-list>
</back>
</article>

If yes, I can add support for similar tagging in JATS Parser.

rinorazzi commented 4 years ago

Hi @Vitaliy-1

Here is a snippet showing how appendices are tagged in the JATS XML I receive for import into OJS:

<article dtd-version="1.0" .................>
<front> ..............</front> 
<body>.............
.............
<p> .......... <xref ref-type="app" rid="app1-1">Appendix 1</xref> ........... </p>
.............
</body>
<back>
..............
<app-group>
<app id="app1-1">
<title>APPENDIX</title>
<p>Text text text text text text text text 
<sup><xref ref-type="bibr" rid="ref26">26</xref></sup>. Text text text text text text text text 
</p>
</app>
</app-group>
</back>

Compared with your example, my JATS XML files have no <sec> inside the <app> tag; I just have a <p> tag with its content. I suppose a structure that includes <sec> is more correct and flexible, though. Ideally, JATS Parser would accept both bare <p> tags (when the appendix contains only paragraphs not structured into sections) and a <sec> structure. Also note that there is a reference to the appendix in the <body>.

By the way, I have a similar issue with the <body> tag in my XML files: in some cases, when the article has no sections, the <body> contains no <sec>, just <p> tags.

Vitaliy-1 commented 4 years ago

Hi @rinorazzi, I was thinking about this lately. Would it be OK from your point of view if I made each appendix an analog of a section, so that all elements supported by a section are allowed there, e.g., lists, paragraphs, sections, tables, etc.? That seems to be in line with the NISO JATS guidelines.
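The idea of handling an appendix as an analog of a section can be illustrated with a rough sketch. It is written in Python with the standard `xml.etree.ElementTree` module purely for illustration and is not JATSParser's code; the names `SECTION_LIKE`, `parse_section`, and `parse_appendices` are hypothetical, and the sample document is a shortened version of the standard's example quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical names throughout; this is an illustration, not JATSParser's API.
SECTION_LIKE = {"sec", "app"}  # tags handled by the same recursive routine


def parse_section(node):
    """Convert a <sec> or <app> element into a nested dict."""
    title = node.find("title")
    section = {
        "type": node.tag,
        "id": node.get("id"),
        "title": "".join(title.itertext()).strip() if title is not None else None,
        "content": [],
    }
    for child in node:
        if child.tag == "title":
            continue
        if child.tag in SECTION_LIKE:
            # Nested sections are parsed exactly like their parent.
            section["content"].append(parse_section(child))
        else:
            # Paragraphs, lists, tables, etc. are reduced to (tag, text) pairs.
            text = " ".join("".join(child.itertext()).split())
            section["content"].append((child.tag, text))
    return section


def parse_appendices(root):
    """Return one section-like dict per <app> under <back>/<app-group>."""
    return [parse_section(app) for app in root.findall("./back/app-group/app")]


SAMPLE = """<article>
<body><p>Body text.</p></body>
<back>
<app-group>
<app id="app1-1">
<title>Appendix</title>
<sec>
<title>The Bipolar Seesaw.</title>
<p>The evidence for the antiphasing ...</p>
</sec>
</app>
</app-group>
</back>
</article>"""

appendices = parse_appendices(ET.fromstring(SAMPLE))
```

Because `<app>` and `<sec>` go through the same function, anything the sketch supports inside a section is automatically supported inside an appendix, including a bare `<p>` directly under `<app>`.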

rinorazzi commented 4 years ago

Hi @Vitaliy-1, your proposal sounds good. The only additional thing I would suggest is to also accept, in both <body> and <app>, <p> tags not embedded in <sec>. In that case the content of both <body> and <app> would be extracted in both of the following examples:

a) Example 1: use of <sec> in both <body> and <app>

<article dtd-version="1.0" .................>
<front> ..............</front> 
<body>.............

<sec>

   <p> Some text here Some text here  Some text here  .....
   <xref ref-type="app" rid="app1-1">Appendix 1</xref> ........... </p>

</sec>

</body>
<back>
..............
<app-group>
<app id="app1-1">
<title>APPENDIX</title>

<sec>

  <p>Text text text text text text text text 
  <sup><xref ref-type="bibr" rid="ref26">26</xref></sup>. Text text text text text text text text 
  </p>

</sec>

</app>
</app-group>
</back>

b) Example 2: direct use of <p> (not embedded in <sec>) in both <body> and <app>

<article dtd-version="1.0" .................>
<front> ..............</front> 
<body>.............

  <p> Some text here Some text here  Some text here  ..... <xref ref-type="app" rid="app1-1">Appendix 1</xref> ........... 
  </p>

</body>
<back>
..............
<app-group>
<app id="app1-1">
<title>APPENDIX</title>

<p>Text text text text text text text text 
  <sup><xref ref-type="bibr" rid="ref26">26</xref></sup>. Text text text text text text text text 
</p>

</app>
</app-group>
</back>
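
The behaviour requested in Examples 1 and 2 could be checked with a small sketch like the one below. It uses Python's standard `xml.etree.ElementTree` and is only an illustration of the idea, not JATSParser's implementation; `text_of`, `collect_paragraphs`, and `extract` are hypothetical helper names, and the two sample documents are condensed, well-formed versions of the examples above with the elided parts left out.

```python
import xml.etree.ElementTree as ET


def text_of(node):
    """Flatten an element's text, collapsing runs of whitespace."""
    return " ".join("".join(node.itertext()).split())


def collect_paragraphs(container):
    """Collect <p> text whether paragraphs sit directly in the container or inside <sec>."""
    found = []
    for child in container:
        if child.tag == "p":
            found.append(text_of(child))
        elif child.tag == "sec":
            found.extend(collect_paragraphs(child))
    return found


def extract(xml_text):
    """Return body paragraphs, appendix paragraphs by id, and appendix ids cited in the text."""
    root = ET.fromstring(xml_text)
    body = root.find("body")
    return {
        "body": collect_paragraphs(body) if body is not None else [],
        "appendices": {
            app.get("id"): collect_paragraphs(app)
            for app in root.findall("./back/app-group/app")
        },
        # <xref ref-type="app"> elements point at an <app> through their rid attribute.
        "app_refs": [x.get("rid") for x in root.iter("xref") if x.get("ref-type") == "app"],
    }


WITH_SEC = """<article>
<body><sec>
<p>Some text here <xref ref-type="app" rid="app1-1">Appendix 1</xref>.</p>
</sec></body>
<back><app-group><app id="app1-1">
<title>APPENDIX</title>
<sec>
<p>Text text
<sup><xref ref-type="bibr" rid="ref26">26</xref></sup>. Text text</p>
</sec>
</app></app-group></back>
</article>"""

WITHOUT_SEC = """<article>
<body>
<p>Some text here <xref ref-type="app" rid="app1-1">Appendix 1</xref>.</p>
</body>
<back><app-group><app id="app1-1">
<title>APPENDIX</title>
<p>Text text
<sup><xref ref-type="bibr" rid="ref26">26</xref></sup>. Text text</p>
</app></app-group></back>
</article>"""

with_sec = extract(WITH_SEC)
without_sec = extract(WITHOUT_SEC)
```

In this sketch both documents yield the same result, which is the outcome asked for: the presence or absence of a `<sec>` wrapper does not change what is extracted from `<body>` or `<app>`.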