stencila / encoda

↔️ A format converter for Stencila documents
https://stencila.github.io/encoda/
Apache License 2.0

JATS: Cite Groups #897

Open rgieseke opened 3 years ago

rgieseke commented 3 years ago

When decoding the following JATS example, which contains a grouped citation of the form "literature [1, 2, 3] shows":

<?xml version="1.0"?>
<article>
<front>
<article-meta>
<contrib-group />
</article-meta>
</front>
<body>
<p id="p1">Now The literature [<xref ref-type="bibr" rid="bib-bib1">1</xref>, <xref ref-type="bibr" rid="bib-bib2">2</xref>, <xref ref-type="bibr" rid="bib-bib3">3</xref>] shows it clearly.</p>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="bib.bib1">
<mixed-citation>A. Book</mixed-citation>
</ref>
<ref id="bib.bib2">
<mixed-citation>C. Article</mixed-citation>
</ref>
<ref id="bib.bib3">
<mixed-citation>D. Manuscript</mixed-citation>
</ref>
</ref-list>
<app-group />
</back>
</article>

I get the following JSON (shortened to show only the paragraph content):

        "Now The literature ",
        {
          "type": "CiteGroup",
          "items": [
            {
              "type": "Cite",
              "target": "bib-bib1",
              "content": [
                "1"
              ]
            },
            {
              "type": "Cite",
              "target": "bib-bib2",
              "content": [
                "2"
              ]
            },
            {
              "type": "Cite",
              "target": "bib-bib3",
              "content": [
                "3"
              ]
            }
          ]
        },

This does not seem to come from the JATS decoder, which has no decodeCitegroup function. Since the square brackets vanish too, I wonder whether this happens somewhere in the final conversion. Is this a bug, or is it expected?
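
To make the suspected behaviour concrete, here is a minimal sketch of a grouping pass that would produce the output above. It is not taken from the encoda source: the function name groupBracketedCites and the node types are hypothetical, modelled only on the JSON shown, and the assumption is that grouping is triggered by a text node ending in "[" directly before a Cite.

```typescript
// Hypothetical node shapes, modelled on the JSON output above (not encoda's types).
type Cite = { type: 'Cite'; target: string; content: Array<string | number> }
type CiteGroup = { type: 'CiteGroup'; items: Cite[] }
type Inline = string | Cite | CiteGroup

const isCite = (node: Inline | undefined): node is Cite =>
  typeof node === 'object' && node !== null && node.type === 'Cite'

// Assumed behaviour: "[" + Cite (+ ", " + Cite)* + "]" collapses into one node
// and the surrounding square brackets are dropped from the text.
function groupBracketedCites(nodes: Inline[]): Inline[] {
  const out: Inline[] = []
  let i = 0
  while (i < nodes.length) {
    const node = nodes[i] as Inline
    if (typeof node === 'string' && node.endsWith('[') && isCite(nodes[i + 1])) {
      const items: Cite[] = []
      let j = i + 1
      for (;;) {
        const candidate = nodes[j]
        if (!isCite(candidate)) break
        items.push(candidate)
        j += 1
        // Continue only across a pure separator that is followed by another Cite.
        const sep = nodes[j]
        if (typeof sep === 'string' && /^\s*[,;]\s*$/.test(sep) && isCite(nodes[j + 1])) j += 1
        else break
      }
      // j now points at the node after the last collected Cite.
      const closing = nodes[j]
      if (typeof closing === 'string' && closing.startsWith(']')) {
        const before = node.slice(0, -1)
        if (before.length > 0) out.push(before)
        if (items.length > 1) {
          const group: CiteGroup = { type: 'CiteGroup', items }
          out.push(group)
        } else {
          out.push(items[0] as Cite)
        }
        const after = closing.slice(1)
        if (after.length > 0) out.push(after)
        i = j + 1
        continue
      }
    }
    // Anything that does not match the bracket pattern is passed through unchanged.
    out.push(node)
    i += 1
  }
  return out
}
```

Fed the inline content of the paragraph above (the text "Now The literature [", three Cite nodes separated by ", ", and the text "] shows it clearly."), this sketch returns the leading text without the bracket, a single CiteGroup with three items, and the trailing text without the bracket, which is the shape I am seeing.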

rgieseke commented 3 years ago

In general, whitespace seems to make a difference:

<?xml version="1.0"?>
<article>
<front>
<article-meta>
<contrib-group />
</article-meta>
</front>
<body>
<p id="p1">This has been shown by [<xref ref-type="bibr" rid="bib-bib1">1</xref>].</p>
<p id="p2">This has been shown by [
<xref ref-type="bibr" rid="bib-bib2">1</xref>
].</p>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="bib.bib1">
<mixed-citation>
A. Foo, A Title, 2021</mixed-citation>
</ref>
<ref id="bib.bib2">
<mixed-citation>
B. Bar, Another Title, 2021</mixed-citation>
</ref>
</ref-list>
</back>
</article>

JSON content:

 "content": [
    {
      "type": "Paragraph",
      "id": "p1",
      "content": [
        "This has been shown by ",
        {
          "type": "Cite",
          "target": "bib-bib1",
          "content": [
            1
          ]
        },
        "."
      ]
    },
    {
      "type": "Paragraph",
      "id": "p2",
      "content": [
        "This has been shown by [\n",
        {
          "type": "Cite",
          "target": "bib-bib2",
          "content": [
            1
          ]
        },
        "\n]."
      ]
    }
  ]
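
One possible explanation for the difference between p1 and p2, sketched below, is that the bracket detection compares the text directly adjacent to the citation, so a newline between the bracket and the xref defeats it. This is only an assumption about the cause; the helper names are hypothetical and not part of encoda. The tolerant variants show what a whitespace-insensitive check could look like.

```typescript
// Strict checks (assumed current behaviour): the bracket must touch the citation.
const opensStrict = (text: string): boolean => text.endsWith('[')
const closesStrict = (text: string): boolean => text.startsWith(']')

// Tolerant checks: whitespace, including newlines, may sit between bracket and citation.
const opensTolerant = (text: string): boolean => /\[\s*$/.test(text)
const closesTolerant = (text: string): boolean => /^\s*\]/.test(text)

// Remove the bracket together with any whitespace between it and the citation.
const stripOpening = (text: string): string => text.replace(/\[\s*$/, '')
const stripClosing = (text: string): string => text.replace(/^\s*\]/, '')
```

With the strict checks, the text "This has been shown by [\n" from p2 is not recognised as an opening bracket, which would match the output above. With the tolerant checks it is recognised, and stripping it leaves "This has been shown by ", the same text that p1 produces.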