wpoa / JATS-to-Mediawiki

A PubMed Central to MediaWiki converter

<mixed-citation> not properly handled #3

Open wrought opened 10 years ago

wrought commented 10 years ago

It seems the XSL should match on this element (I believe the control flow has a disjunction that matches any one of the several possible citation tags), but this did not work for the following article, for instance:

PMC3919830

wrought commented 10 years ago

I need to take another pass at patching this. Any thoughts?

Klortho commented 10 years ago

What are the tangible symptoms of the problem? I can take a look if you want. I'm trying to chip away at these issues as I find time.

wrought commented 10 years ago

doi: 10.1371/currents.dis.6773eb9d5e64b733ab490f78de346003

Here's a sample from the text:

<ref-list>
<title>References</title>
<ref id="ref1">
<label>ref1</label>
<mixed-citation>
Haines A, Kovats RS, Campbell-Lendrum D, Corvalan C. Climate change and human health: impacts, vulnerability, and mitigation. Lancet. 2006 Jun 24;367(9528):2101-9. PubMed PMID:16798393.
</mixed-citation>
</ref>
<ref id="ref2">
<label>ref2</label>
<mixed-citation>
Environment Canada. Climate Change: Canada's Action on Climate Change 2009. http://www.ec.gc.ca/Publications/E5E4BB6D-1824-4B00-9512-933F961FD7DD/Climate_Change.pdf. Accessed November 2010.
</mixed-citation>
</ref>

After I process the JATS XML with the converter, the resulting MediaWiki markup contains this:

== References ==
&lt;references&gt;&lt;ref name="ref1"&gt;{{Citation

}}
&lt;/ref&gt;
&lt;ref name="ref2"&gt;{{Citation

}}
&lt;/ref&gt;
&lt;ref name="ref3"&gt;{{Citation

Perhaps this is a fringe case, but it seems to be due to the use of <mixed-citation> which, as I understand it, is used from time to time.

wrought commented 10 years ago

This case is handled here: https://github.com/wpoa/JATS-to-Mediawiki/blob/master/jats-to-mediawiki.xsl#L754

Is there an easy way in XSL to determine if a particular element has no children? If the mixed-citation element has no child elements, then we should expect it to be either a plain-text citation string (bummer) or empty. Passing either one through as-is and skipping the other formatting steps should be sufficient for this case, I believe.

Klortho commented 10 years ago

> Is there an easy way in XSL to determine if a particular element has no children?

Yes, I think: <xsl:if test="not(*)">.
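
For illustration, here is a minimal sketch of how that test might be used. It assumes a standalone template matching mixed-citation; this template is hypothetical and is not the code at the linked line of jats-to-mediawiki.xsl. In XPath, `*` selects child elements only, so `not(*)` is true both for an empty element and for one that contains nothing but text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical template, for illustration only. -->
  <xsl:template match="mixed-citation">
    <xsl:choose>
      <!-- No child elements: the citation is plain text (or empty),
           so emit its whitespace-normalized string value as-is. -->
      <xsl:when test="not(*)">
        <xsl:value-of select="normalize-space(.)"/>
      </xsl:when>
      <!-- Child elements present: hand off to whatever structured
           handling the stylesheet already has. -->
      <xsl:otherwise>
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With the sample above, both ref1 and ref2 would take the first branch, since their mixed-citation elements contain only text.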

So, you're suggesting not using the Citation template for these, and you want the output to look like the following, right?

&lt;references&gt;
&lt;ref name="ref1"&gt;Haines A, Kovats RS, Campbell-Lendrum D, Corvalan C. Climate change and human health: impacts, vulnerability, and mitigation. Lancet. 2006 Jun 24;367(9528):2101-9. PubMed PMID:16798393.&lt;/ref&gt;
&lt;ref name="ref2"&gt;Environment Canada. Climate Change: Canada's Action on Climate Change 2009. http://www.ec.gc.ca/Publications/E5E4BB6D-1824-4B00-9512-933F961FD7DD/Climate_Change.pdf. Accessed November 2010.&lt;/ref&gt;