Pandora-IsoMemo / iso-data

ETL for IsoMemo Database
https://pandora-isomemo.github.io/iso-data/
GNU General Public License v3.0

Pandora-Iso-App-Data: Update for getting citations from CrossRef #3

Open arunge opened 2 years ago

arunge commented 2 years ago

Since we could not make the citation conversion fast enough for interactive use, we decided to adjust the code where the data sources are loaded and written to a database instead (https://github.com/Pandora-IsoMemo/iso-data). The app only loads pre-formatted tables from this database.

New goals:

@jroachell15 I will take over this ticket and develop a solution here.

@isomemo We need to put this as another task into the new task list.

Originally posted by @arunge in https://github.com/Pandora-IsoMemo/iso-app/issues/14#issuecomment-1156208382

arunge commented 2 years ago

Here are some more details on where to start, thanks to @isomemo.

This is our code for the DOI assignment: https://github.com/Pandora-IsoMemo/iso-data/blob/main/R/01-doi.R

IMPORTANT: currently this is only called when a DOI is missing from one of the 3 DOI fields that we have in our IsoMemo data table. In these cases, a text string is used to search for the DOI (using the title of the database, compilation, or original reference, i.e. the 3 reference fields).

The function lookupReference is the one used to locate DOIs from the title string and this relies on the crossref API https://www.crossref.org/documentation/retrieve-metadata/rest-api/

This is the API call in the lookupReference function to get a DOI from a string: res <- httr::GET(url, query = list(query.bibliographic = txt, rows = 1))
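To make the syntax of that call concrete, here is a minimal sketch of a DOI lookup built around it. It is not the repository's `lookupReference` implementation: the function names `extract_first_doi` and `lookup_doi_sketch` are made up for illustration, and the endpoint `https://api.crossref.org/works` and the `message$items[[1]]$DOI` response layout are assumptions based on the public Crossref REST API.

```r
# Sketch only; not the code in R/01-doi.R.
# Assumption: the Crossref response is JSON shaped like
# list(message = list(items = list(list(DOI = "..."), ...))).

# Pull the DOI of the first hit out of a parsed response.
# Returns NA_character_ when there is no hit.
extract_first_doi <- function(parsed) {
  items <- parsed$message$items
  if (length(items) == 0 || is.null(items[[1]]$DOI)) {
    return(NA_character_)
  }
  items[[1]]$DOI
}

# Search Crossref for a free-text reference and return the best-matching DOI.
lookup_doi_sketch <- function(txt, url = "https://api.crossref.org/works") {
  res <- httr::GET(url, query = list(query.bibliographic = txt, rows = 1))
  if (httr::status_code(res) != 200) {
    return(NA_character_)
  }
  parsed <- httr::content(res, as = "parsed", type = "application/json")
  extract_first_doi(parsed)
}
```

Keeping the parsing in its own function means it can be checked against a hand-written list without any network access.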

The returned value is added to the IsoMemo data table and a flag field within this table is marked as "Yes" to flag that the DOI was fetched via the search rather than directly entered by a user into the database.
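As a rough illustration of this fill-and-flag step, the sketch below works on a plain data frame. The column names `reference`, `doi`, and `doi_auto`, the flag value "No", and the function name `fill_missing_dois` are all hypothetical; the real table has 3 DOI fields and its own flag field. The lookup is passed in as an argument so the logic can be tried without calling crossref.

```r
# Sketch only; column names and the "No" flag value are assumptions.
# `lookup` is any function that maps a reference string to a DOI
# (or NA_character_ when nothing is found).
fill_missing_dois <- function(data, lookup) {
  # A DOI counts as missing when it is NA or an empty string.
  missing <- is.na(data$doi) | data$doi == ""

  # Search only for the rows that have no DOI.
  found <- vapply(data$reference[missing], lookup, character(1),
                  USE.NAMES = FALSE)
  data$doi[missing] <- found

  # Flag "Yes" only where the search actually returned a DOI.
  flag <- rep("No", nrow(data))
  flag[which(missing)[!is.na(found)]] <- "Yes"
  data$doi_auto <- flag
  data
}
```

User-entered DOIs are never overwritten here, which matches the requirement that the existing behaviour stays as is.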

IMPORTANT: the above should keep working as is! That is, whenever a DOI is missing (not provided by the database creators), it should be fetched and flagged. However, we now also need to call the crossref API for ALL entries, even those that have a user-assigned DOI, to get their Bibtex reference. This is called DOI content negotiation, and this page describes it: https://citation.crosscite.org/docs.html

This page, under point 4, lists the accepted formats. There are several options besides Bibtex, and these could be used if preferable. I am thinking that we could then use Jian's code to convert between formats (from Bibtex, or another format, into the options offered by the package used by Jian). Since the Bibtex text would be fetched from the IsoMemo data table (and not via crossref), it should be much faster.

As for the actual code call to get the Bibtex string (or other format), the page mentioned above gives this example:

$ curl -LH "Accept: application/vnd.citationstyles.csl+json, application/rdf+xml" https://doi.org/10.1126/science.169.3946.635

I suppose that for Bibtex metadata format given as an xml string one would have something like this:

$ curl -LH "Accept: application/x-bibtex, application/rdf+xml" https://doi.org/10.1126/science.169.3946.635
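The same request could be made from R with httr, which the existing DOI code already uses. This is only a sketch under assumptions: `doi_to_url` and `fetch_bibtex_sketch` are hypothetical names, only `application/x-bibtex` is requested (without the `application/rdf+xml` fallback from the curl line), and it relies on httr following the doi.org redirect by default, which should be checked against the installed version.

```r
# Sketch only; function names are made up for illustration.

# Build the doi.org URL, accepting bare DOIs as well as DOIs that
# were entered as full doi.org links.
doi_to_url <- function(doi) {
  doi <- sub("^https?://(dx\\.)?doi\\.org/", "", trimws(doi))
  paste0("https://doi.org/", doi)
}

# Ask the DOI resolver for a Bibtex entry via content negotiation.
# Returns NA_character_ when the request does not succeed.
fetch_bibtex_sketch <- function(doi) {
  res <- httr::GET(doi_to_url(doi),
                   httr::add_headers(Accept = "application/x-bibtex"))
  if (httr::status_code(res) != 200) {
    return(NA_character_)
  }
  httr::content(res, as = "text", encoding = "UTF-8")
}
```

For example, `fetch_bibtex_sketch("10.1126/science.169.3946.635")` would be the R counterpart of the curl call above.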

arunge commented 2 years ago

Note: More details on parameters such as bibliographic, rows:

https://github.com/Pandora-IsoMemo/iso-data/blob/408a6148c866fbfaa2dd1e522369b3f4d4d49af2/R/01-doi.R#L11-L15

can be found here: https://www.crossref.org/documentation/retrieve-metadata/rest-api/tips-for-using-the-crossref-rest-api/


isomemo commented 2 years ago

@arunge @jroachell15

We will still use Jian's code for the conversion but now instead of directly consulting the (slow) crossref site we will consult our IsoMemo data table where the Bibtex references are going to be stored. These can then be converted into a format selected by the users.
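One possible shape for the ETL side of this, as a sketch rather than a decided design: fetch the Bibtex entry once per unique DOI while the data table is built, and store it in an extra column that the app later reads. The column names `doi` and `bibtex` and the function name `add_bibtex_column` are assumptions; `fetch` stands for whatever function ends up doing the crossref call.

```r
# Sketch only; column names are assumptions.
# `fetch` is any function that maps a DOI to a Bibtex string
# (or NA_character_ on failure).
add_bibtex_column <- function(data, fetch) {
  # Query each distinct, non-empty DOI only once.
  dois <- unique(data$doi[!is.na(data$doi) & data$doi != ""])
  bib <- vapply(dois, fetch, character(1), USE.NAMES = FALSE)

  # Map the results back onto every row; rows without a DOI get NA.
  data$bibtex <- bib[match(data$doi, dois)]
  data
}
```

Because many rows tend to share a reference, deduplicating the DOIs first keeps the number of slow remote calls down, and the app itself never has to contact crossref.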

I have created a video that describes my suggested procedure. Please check this carefully: https://youtu.be/49QENRc7zwk

@arunge the bibliographic search is already implemented by Andreas. This returns a DOI from a text reference. What we need is the formatted reference, please see video.

arunge commented 2 years ago

> @arunge the bibliographic search is already implemented by Andreas. This returns a DOI from a text reference.

Yes, this is what I found in the code. I just wanted to take this as an example for possible other parameters and to understand the syntax. Just a reminder for me. :wink:

isomemo commented 2 years ago

@arunge Ok :-)