stencila / encoda

↔️ A format converter for Stencila documents
https://stencila.github.io/encoda/
Apache License 2.0

JATS: decoding/encoding of pub-ids in references #413

Open fred-atherden opened 4 years ago

fred-atherden commented 4 years ago

In JATS, identifiers such as DOIs and PubMed IDs are defined using the pub-id element.

See here. In the source JATS, such a reference looks like this:

<ref id="bib2">
    <element-citation publication-type="journal">
        <person-group person-group-type="author">
            <name>
                <surname>Agaoglu</surname>
                <given-names>MN</given-names>
            </name>
            <name>
                <surname>Chung</surname>
                <given-names>ST</given-names>
            </name>
        </person-group>
        <year iso-8601-date="2016">2016</year>
        <article-title>Can (should) theories of crowding be unified?</article-title>
        <source>Journal of Vision</source>
        <volume>16</volume>
        <elocation-id>10</elocation-id>
        <pub-id pub-id-type="doi">10.1167/16.15.10</pub-id>
        <pub-id pub-id-type="pmid">27936273</pub-id>
    </element-citation>
</ref>

After running

encoda convert https://elifesciences.org/articles/42512 48114.jsonld
encoda convert https://elifesciences.org/articles/42512 48114.xml --to jats

it is output as:

{
  "authors": [
    {
      "givenNames": [
        "MN"
      ],
      "familyNames": [
        "Agaoglu"
      ],
      "type": "Person"
    },
    {
      "givenNames": [
        "ST"
      ],
      "familyNames": [
        "Chung"
      ],
      "type": "Person"
    }
  ],
  "title": "Can (should) theories of crowding be unified?",
  "type": "Article",
  "datePublished": "2016",
  "isPartOf": {
    "title": "Journal of Vision",
    "volumeNumber": 16,
    "type": "PublicationVolume"
  },
  "id": "bib2"
}

and

<ref id="bib2">
    <element-citation>
        <article-title>Can (should) theories of crowding be unified?</article-title>
        <person-group person-group-type="author">
            <name>
                <surname>Agaoglu</surname>
                <given-names>MN</given-names>
            </name>
            <name>
                <surname>Chung</surname>
                <given-names>ST</given-names>
            </name>
        </person-group>
        <year iso-8601-date="2016">2016</year>
        <source>Journal of Vision</source>
        <volume>16</volume>
    </element-citation>
</ref>

respectively.

nokome commented 4 years ago

For decoding, this has been addressed by 062a4afa6c6ee66470d782bcd5a0d45510048e00 (not yet merged into master). The JSON-LD produced is:

  "identifiers": [
    {
      "name": "publisher-id",
      "propertyID": "https://registry.identifiers.org/registry/publisher-id",
      "value": "42512",
      "type": "PropertyValue"
    },
    {
      "name": "doi",
      "propertyID": "https://registry.identifiers.org/registry/doi",
      "value": "10.7554/eLife.42512",
      "type": "PropertyValue"
    },
    {
      "name": "elocation-id",
      "propertyID": "https://registry.identifiers.org/registry/elocation-id",
      "value": "e42512",
      "type": "PropertyValue"
    }
  ],

which follows the recommendation here on how to encode identifiers in JSON-LD.
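
As a rough illustration of the pattern in the JSON-LD above, a JATS pub-id could be mapped to a PropertyValue along these lines. This is only a sketch: the `decodePubId` name and the `PropertyValue` interface are hypothetical and are not taken from Encoda's source, and the `propertyID` URL is simply built the same way as in the output shown.

```typescript
// Hypothetical shape mirroring the PropertyValue objects in the JSON-LD above.
interface PropertyValue {
  type: 'PropertyValue'
  name: string
  propertyID: string
  value: string
}

// Build an identifier from a JATS `pub-id-type` attribute and the element text.
// `decodePubId` is an illustrative name, not a function from Encoda.
function decodePubId(pubIdType: string, text: string): PropertyValue {
  return {
    type: 'PropertyValue',
    name: pubIdType,
    // Assumption: the registry URL is formed by appending the id type.
    propertyID: `https://registry.identifiers.org/registry/${pubIdType}`,
    value: text.trim(),
  }
}

const doi = decodePubId('doi', '10.1167/16.15.10')
console.log(JSON.stringify(doi, null, 2))
```

Keeping the id type in `name` means the original `pub-id-type` attribute can be recovered when encoding back to JATS.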

Accordingly, I am changing the title so that it refers only to encoding these identifiers to JATS.

nokome commented 4 years ago

Apologies, when I went to change the title I noticed that this issue is about ids in references, which have not yet been addressed.
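
For the encoding direction, one possible approach is to turn each identifier on a reference back into a pub-id element like those in the original reference. The sketch below assumes identifiers are stored as PropertyValue objects with the id type in `name`. `encodePubId`, `escapeXml` and the interface are hypothetical names for illustration, not Encoda's API.

```typescript
// Hypothetical shape of an identifier attached to a reference.
interface PropertyValue {
  type: 'PropertyValue'
  name: string
  propertyID?: string
  value: string | number // assumption: a value may be a string or a number
}

// Escape characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Serialize one identifier as a JATS <pub-id> element.
// `encodePubId` is an illustrative name, not a function from Encoda.
function encodePubId(id: PropertyValue): string {
  const type = escapeXml(id.name)
  const value = escapeXml(String(id.value))
  return `<pub-id pub-id-type="${type}">${value}</pub-id>`
}

const identifiers: PropertyValue[] = [
  { type: 'PropertyValue', name: 'doi', value: '10.1167/16.15.10' },
  { type: 'PropertyValue', name: 'pmid', value: 27936273 },
]
const xml = identifiers.map(encodePubId)
console.log(xml.join('\n'))
```

The resulting strings would then be placed inside the reference's element-citation alongside the other citation elements.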

nokome commented 3 years ago

The decoding part of this has now been done in 2fc33f6dade9a245e249cbd8458cfa1899d01902, e.g.:

  "references": [
    {
      "type": "Article",
      "id": "bib1",
      "authors": [
        {
          "type": "Person",
          "familyNames": [
            "Altman"
          ],
          "givenNames": [
            "DG"
          ]
        },
        {
          "type": "Person",
          "familyNames": [
            "Royston"
          ],
          "givenNames": [
            "P"
          ]
        }
      ],
      "datePublished": {
        "type": "Date",
        "value": "2006"
      },
      "identifiers": [
        {
          "type": "PropertyValue",
          "name": "doi",
          "propertyID": "https://registry.identifiers.org/registry/doi",
          "value": "10.1136/bmj.332.7549.1080"
        },
        {
          "type": "PropertyValue",
          "name": "pmid",
          "propertyID": "https://registry.identifiers.org/registry/pmid",
          "value": 16675816
        }
      ],
      "isPartOf": {
        "type": "PublicationVolume",
        "isPartOf": {
          "type": "Periodical",
          "name": "BMJ"
        },
        "volumeNumber": 332
      },
      "title": "The cost of dichotomising continuous variables"
    },