ropensci-archive / rorcid

:warning: ARCHIVED :warning: A programmatic interface to the Orcid.org API

Parsing Orcid publications (aka 'works') #3

Closed sckott closed 9 years ago

sckott commented 12 years ago

Hey @cboettig, hoping you can help figure out how to parse these citations from ORCID. This is an example call using Peter Binfield's ORCID iD. Do you have any functions from knitcitations that we can use to parse this, and probably to handle the BibTeX-like citations too?

> temp <- getURL("http://pub.orcid.org/0000-0002-9341-7985/orcid-works")
> xmlParse(temp)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orcid-message xmlns="http://www.orcid.org/ns/orcid">
  <message-version>1.0.7</message-version>
  <orcid-profile type="user">
    <orcid>0000-0002-9341-7985</orcid>
    <orcid-history>
      <creation-method>website</creation-method>
      <completion-date>2012-10-16T13:08:32.715+01:00</completion-date>
      <submission-date>2012-10-16T12:39:18.211+01:00</submission-date>
      <claimed>true</claimed>
      <email-verified>true</email-verified>
    </orcid-history>
    <orcid-activities>
      <orcid-works>
        <orcid-work put-code="380908">
          <work-title>
            <title>Academic journal publishing</title>
            <subtitle>Serials Librarian</subtitle>
          </work-title>
          <work-citation>
            <work-citation-type>bibtex</work-citation-type>
            <citation>@article { binfield2008,&#13;
    title = {Academic journal publishing},&#13;
    journal = {Serials Librarian},&#13;
    year = {2008},&#13;
    volume = {54},&#13;
    number = {1-2},&#13;
    pages = {141-153},&#13;
    author = {Binfield, , P. and Rolnik, , Z. and Brown, , C. and Cole, , K.}&#13;
}&#13;
&#13;
</citation>
          </work-citation>
          <publication-date>
            <year>2008</year>
          </publication-date>
          <work-external-identifiers>
            <work-external-identifier>
              <work-external-identifier-type>doi</work-external-identifier-type>
              <work-external-identifier-id>10.1080/03615260801973992</work-external-identifier-id>
            </work-external-identifier>
          </work-external-identifiers>
          <url>http://www.scopus.com/inward/record.url?eid=2-s2.0-67650844237&amp;partnerID=MN8TOARS</url>
        </orcid-work>
        <orcid-work put-code="380909">
          <work-title>
            <title>Image quality factors affecting the reconstruction of laser transmission holograms: interim results</title>
            <subtitle>IEE Conference Publication</subtitle>
          </work-title>
          <work-citation>
            <work-citation-type>bibtex</work-citation-type>
            <citation>@article { binfield1993,&#13;
    title = {Image quality factors affecting the reconstruction of laser transmission holograms: interim results},&#13;
    journal = {IEE Conference Publication},&#13;
    year = {1993},&#13;
    number = {379},&#13;
    pages = {122-128},&#13;
    author = {Binfield, , P. and Watson, , J.}&#13;
}&#13;
&#13;
</citation>
          </work-citation>
          <publication-date>
            <year>1993</year>
          </publication-date>
          <url>http://www.scopus.com/inward/record.url?eid=2-s2.0-0027813935&amp;partnerID=MN8TOARS</url>
        </orcid-work>
        <orcid-work put-code="380910">
          <work-title>
            <title>Modern developments in holography and its materials</title>
            <subtitle>Optics and Laser Technology</subtitle>
          </work-title>
          <work-citation>
            <work-citation-type>bibtex</work-citation-type>
            <citation>@article { binfield1992,&#13;
    title = {Modern developments in holography and its materials},&#13;
    journal = {Optics and Laser Technology},&#13;
    year = {1992},&#13;
    volume = {24},&#13;
    number = {5},&#13;
    pages = {307-308},&#13;
    author = {Binfield, , P.}&#13;
}&#13;
&#13;
</citation>
          </work-citation>
          <publication-date>
            <year>1992</year>
          </publication-date>
          <url>http://www.scopus.com/inward/record.url?eid=2-s2.0-44049114346&amp;partnerID=MN8TOARS</url>
        </orcid-work>
        <orcid-work put-code="380911">
          <work-title>
            <title>Publishing 101: The basics of academic publishing</title>
            <subtitle>Serials Librarian</subtitle>
          </work-title>
          <work-citation>
            <work-citation-type>bibtex</work-citation-type>
            <citation>@article { binfield2008,&#13;
    title = {Publishing 101: The basics of academic publishing},&#13;
    journal = {Serials Librarian},&#13;
    year = {2008},&#13;
    volume = {54},&#13;
    number = {1-2},&#13;
    pages = {37-42},&#13;
    author = {Rolnik, , Z. and Binfield, , P. and Graves, , T.}&#13;
}&#13;
&#13;
</citation>
          </work-citation>
          <publication-date>
            <year>2008</year>
          </publication-date>
          <work-external-identifiers>
            <work-external-identifier>
              <work-external-identifier-type>doi</work-external-identifier-type>
              <work-external-identifier-id>10.1080/03615260801973414</work-external-identifier-id>
            </work-external-identifier>
          </work-external-identifiers>
          <url>http://www.scopus.com/inward/record.url?eid=2-s2.0-67650815254&amp;partnerID=MN8TOARS</url>
        </orcid-work>
        <orcid-work put-code="380912">
          <work-title>
            <title>Reciprocity failure in continuous wave holography</title>
            <subtitle>Applied Optics</subtitle>
          </work-title>
          <work-citation>
            <work-citation-type>bibtex</work-citation-type>
            <citation>@article { binfield1993,&#13;
    title = {Reciprocity failure in continuous wave holography},&#13;
    journal = {Applied Optics},&#13;
    year = {1993},&#13;
    volume = {32},&#13;
    number = {23},&#13;
    pages = {4337-4343},&#13;
    author = {Binfield, , P. and Galloway, , R. and Watson, , J.}&#13;
}&#13;
&#13;
</citation>
          </work-citation>
          <publication-date>
            <year>1993</year>
          </publication-date>
          <url>http://www.scopus.com/inward/record.url?eid=2-s2.0-0027639305&amp;partnerID=MN8TOARS</url>
        </orcid-work>
        <orcid-work put-code="380913">
          <work-title>
            <title>Reputation, authority and incentives. Or: How to get rid of the Impact Factor</title>
          </work-title>
          <work-citation>
            <work-citation-type>formatted-unspecified</work-citation-type>
            <citation>Brembs, B, Brembs, B &amp; Binfield, P, 2009, 'Reputation, authority and incentives. Or: How to get rid of the Impact Factor', &lt;i&gt;Nature Precedings&lt;/i&gt;.</citation>
          </work-citation>
          <publication-date>
            <year>2009</year>
            <month>01</month>
            <day>21</day>
          </publication-date>
          <work-external-identifiers>
            <work-external-identifier>
              <work-external-identifier-type>doi</work-external-identifier-type>
              <work-external-identifier-id>10.1038/npre.2009.2801</work-external-identifier-id>
            </work-external-identifier>
          </work-external-identifiers>
          <work-contributors>
            <contributor>
              <contributor-attributes>
                <contributor-sequence>first</contributor-sequence>
                <contributor-role>author</contributor-role>
              </contributor-attributes>
            </contributor>
          </work-contributors>
        </orcid-work>
      </orcid-works>
    </orcid-activities>
  </orcid-profile>
</orcid-message>

mfenner commented 12 years ago

Please remember that the work-citation part looks different depending on how the article was imported. The BibTeX probably comes from Scopus; for a work imported from CrossRef, it will look like this:

<work-citation>
  <work-citation-type>formatted-unspecified</work-citation-type>
  <citation>Fenner, M, 2008, 'Blogs, Wikis und Podcasts im Unterricht', &lt;i&gt;Biologie in unserer Zeit&lt;/i&gt;, vol. 38, no. 5, pp. 284-286.</citation>
</work-citation>

You should parse the XML based on the work-citation-type.
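
A minimal sketch of that branching in R, using the XML package already used in this thread. The inline sample is abridged from the responses quoted above rather than fetched live, and `bibtex_field` and `parse_citation` are hypothetical helper names, not functions from rorcid or knitcitations. The regular expression only handles simple `field = {value}` entries without nested braces.

```r
library(XML)

# Inline sample in the shape shown in this thread (not a live API call).
sample_xml <- "<orcid-message xmlns='http://www.orcid.org/ns/orcid'>
  <orcid-work>
    <work-citation>
      <work-citation-type>bibtex</work-citation-type>
      <citation>@article { binfield1992,
    title = {Modern developments in holography and its materials},
    year = {1992}
}</citation>
    </work-citation>
  </orcid-work>
  <orcid-work>
    <work-citation>
      <work-citation-type>formatted-unspecified</work-citation-type>
      <citation>Fenner, M, 2008, 'Blogs, Wikis und Podcasts im Unterricht'.</citation>
    </work-citation>
  </orcid-work>
</orcid-message>"

ns <- c(r = "http://www.orcid.org/ns/orcid")
doc <- xmlParse(sample_xml, asText = TRUE)

# Hypothetical helper: pull one simple BibTeX field such as title or year.
bibtex_field <- function(bib, field) {
  pattern <- paste0(field, "[[:space:]]*=[[:space:]]*\\{([^}]*)\\}")
  hit <- regmatches(bib, regexec(pattern, bib))[[1]]
  if (length(hit) == 2) hit[2] else NA_character_
}

# Hypothetical helper: branch on <work-citation-type> for one node.
parse_citation <- function(node) {
  type <- xpathSApply(node, "./r:work-citation-type", xmlValue, namespaces = ns)
  text <- xpathSApply(node, "./r:citation", xmlValue, namespaces = ns)
  if (length(type) == 1 && type == "bibtex") {
    list(type = type,
         title = bibtex_field(text, "title"),
         year = bibtex_field(text, "year"))
  } else {
    # Formatted strings are kept as-is; no structure is assumed.
    list(type = type, text = text)
  }
}

citations <- lapply(getNodeSet(doc, "//r:work-citation", namespaces = ns),
                    parse_citation)
```

Each element of `citations` carries its `type`, so downstream code can decide whether structured fields or only a display string are available.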

sckott commented 12 years ago

Thanks for the note.

njahn82 commented 11 years ago

Hey Martin, dear all,

Can I expect <work-external-identifier-id> to contain a DOI, or are you considering other identifier types? What happens if none is provided?

Sketch:

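library(XML)

# Public works endpoint for a single ORCID iD
url <- "http://pub.orcid.org/0000-0003-1419-2405/orcid-works/"

doc <- xmlTreeParse(url, useInternalNodes = TRUE)

# Collects every external identifier value (not only DOIs);
# the prefix "r" is bound to the ORCID namespace for the XPath query
doi <- xpathSApply(doc, "//r:work-external-identifier-id", xmlValue,
                   namespaces = c(r = "http://www.orcid.org/ns/orcid"))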

mfenner commented 11 years ago

Najko, content in ORCID can come from several places. If imported from CrossRef, the work will have a DOI, but that might not always be true for other content, e.g. when imported from a file.
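
Since a DOI may be missing, one option is to look up the identifier per work and return `NA` when there is none. The following is a sketch under that assumption, run against an abridged inline sample shaped like the XML quoted earlier in this thread; `work_doi` is a hypothetical helper name.

```r
library(XML)

# Abridged inline sample: the first work has a DOI, the second has none.
sample_xml <- "<orcid-message xmlns='http://www.orcid.org/ns/orcid'>
  <orcid-work put-code='380908'>
    <work-external-identifiers>
      <work-external-identifier>
        <work-external-identifier-type>doi</work-external-identifier-type>
        <work-external-identifier-id>10.1080/03615260801973992</work-external-identifier-id>
      </work-external-identifier>
    </work-external-identifiers>
  </orcid-work>
  <orcid-work put-code='380909'>
    <publication-date><year>1993</year></publication-date>
  </orcid-work>
</orcid-message>"

ns <- c(r = "http://www.orcid.org/ns/orcid")
doc <- xmlParse(sample_xml, asText = TRUE)

# Hypothetical helper: first DOI of one <orcid-work>, or NA if there is none.
work_doi <- function(work) {
  path <- paste0("./r:work-external-identifiers/r:work-external-identifier",
                 "[r:work-external-identifier-type='doi']",
                 "/r:work-external-identifier-id")
  ids <- xpathSApply(work, path, xmlValue, namespaces = ns)
  if (length(ids) > 0) ids[[1]] else NA_character_
}

works <- getNodeSet(doc, "//r:orcid-work", namespaces = ns)
dois <- vapply(works, work_doi, character(1))
names(dois) <- vapply(works, xmlGetAttr, character(1), name = "put-code")
```

Querying relative to each `<orcid-work>` keeps the result aligned with the works, whereas a document-wide `//` query silently drops works that have no identifier.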

njahn82 commented 11 years ago

Thanks Martin. Actually, I am struggling to parse the <orcid-works> nodes: within each <orcid-work>, the ORCID API exposes a node referencing an external ID, a BibTeX entry within <citation>, or some other formatted string within <citation>. Thus, R functions might require if/else handling.

To illustrate, have a look at:

library(XML)
url <- "http://pub.orcid.org/0000-0003-1419-2405/orcid-works/"
doc <- xmlTreeParse(url, useInternalNodes = TRUE)
# Returns the raw <citation> text, whatever its work-citation-type
xpathSApply(doc, "//r:work-citation//r:citation", xmlValue,
            namespaces = c(r = "http://www.orcid.org/ns/orcid"))
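
To see which of these cases each work falls into, the nodes can be summarised one row per work. This is only a sketch on an abridged inline sample shaped like the XML quoted earlier in this thread; `first_or_na` is a hypothetical helper name, and the column names are chosen for illustration.

```r
library(XML)

# Abridged inline sample: one BibTeX work without an identifier,
# one formatted citation with a DOI.
sample_xml <- "<orcid-message xmlns='http://www.orcid.org/ns/orcid'>
  <orcid-work put-code='380912'>
    <work-title><title>Reciprocity failure in continuous wave holography</title></work-title>
    <work-citation>
      <work-citation-type>bibtex</work-citation-type>
      <citation>@article { binfield1993, year = {1993} }</citation>
    </work-citation>
  </orcid-work>
  <orcid-work put-code='380913'>
    <work-title><title>Reputation, authority and incentives</title></work-title>
    <work-citation>
      <work-citation-type>formatted-unspecified</work-citation-type>
      <citation>Brembs, B &amp; Binfield, P, 2009.</citation>
    </work-citation>
    <work-external-identifiers>
      <work-external-identifier>
        <work-external-identifier-type>doi</work-external-identifier-type>
        <work-external-identifier-id>10.1038/npre.2009.2801</work-external-identifier-id>
      </work-external-identifier>
    </work-external-identifiers>
  </orcid-work>
</orcid-message>"

ns <- c(r = "http://www.orcid.org/ns/orcid")
doc <- xmlParse(sample_xml, asText = TRUE)

# Hypothetical helper: first match of a relative XPath, or NA when absent.
first_or_na <- function(node, path) {
  hit <- xpathSApply(node, path, xmlValue, namespaces = ns)
  if (length(hit) > 0) hit[[1]] else NA_character_
}

works <- getNodeSet(doc, "//r:orcid-work", namespaces = ns)
summary_df <- data.frame(
  title = vapply(works, first_or_na, character(1),
                 path = "./r:work-title/r:title"),
  citation_type = vapply(works, first_or_na, character(1),
                         path = "./r:work-citation/r:work-citation-type"),
  external_id = vapply(works, first_or_na, character(1),
                       path = ".//r:work-external-identifier-id"),
  stringsAsFactors = FALSE
)
```

A table like `summary_df` makes the if/else cases explicit: rows with an `external_id` could be resolved against an external source, while the others would have to rely on the embedded citation.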

Do you know whether the ORCID API team seeks to better distinguish between linking to external sources (for fetching bibliographic metadata in a next step) and a format that describes bibliographic data as it is used within the digital library domain (such as bibo or MODS)?

Sorry, I do not know if rorcid is the appropriate forum for these questions. Please let me know if I should contact ORCID support instead.

sckott commented 9 years ago

Using JSON now; this seems fine.