jroachell15 commented 2 years ago

Essentially, implement this package in the app:

Getting citations from CrossRef. We already have fields with user-given reference texts, from which the system gets a DOI (if one was not already assigned). Now I would like to have citations in different formats:
[x] "rdf-xml"
[x] "turtle"
[x] "citeproc-json"
[x] "citeproc-json-ish"
[x] "text", "ris", "bibtex" (default)
[x] "crossref-xml"
[ ] "datacite-xml"

Error in cn(dois, ...) : Format 'datacite-xml' for 'https://doi.org/10.1016/j.jasrep.2017.07.030' is not supported by the DOI registration agency: 'crossref'. Try one of the following formats: rdf-xml, turtle, citeproc-json, citeproc-json-ish, text, ris, bibtex, crossref-xml, bibentry, crossref-tdm
[x] "bibentry"

Error: Please install bibtex. Solution: install.packages("bibtex"), then run the code again using format = "bibentry"
[x] "crossref-tdm"
See this R package, which does this: https://cran.r-project.org/web/packages/rcrossref/rcrossref.pdf
You can save these different formats in separate columns. This can be done while obtaining the DOI, but if the DOI is given by the user, then it should be used instead to get the citation. If none is found, then write "none".
When exporting citations (button "Citation type"), the user could then select which formats to include (using checkboxes).
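A minimal sketch of the "one column per format" idea, not the app's actual implementation. The helper name `addCitationColumns`, the `citation_` column prefix, and the `fetch` argument are all hypothetical. `fetch` defaults to a thin wrapper around `rcrossref::cr_cn()`, and the fallback value "none" follows the wording above.

```r
# Hypothetical helper: adds one citation column per requested format.
# `fetch` is any function(doi, format) returning a single string; the default
# assumes the rcrossref package is installed and the format is text-like.
addCitationColumns <- function(df, doiColumn, formats,
                               fetch = function(doi, format) {
                                 rcrossref::cr_cn(dois = doi, format = format)
                               }) {
  for (fmt in formats) {
    df[[paste0("citation_", fmt)]] <- vapply(
      as.character(df[[doiColumn]]),
      function(doi) {
        # No DOI available: nothing to look up.
        if (is.na(doi) || !nzchar(doi)) return("none")
        # Unsupported formats or failed requests also end up as "none".
        out <- tryCatch(fetch(doi, fmt), error = function(e) NULL)
        if (is.character(out) && length(out) == 1) out else "none"
      },
      character(1),
      USE.NAMES = FALSE
    )
  }
  df
}
```

Passing a stub as `fetch` makes it possible to try the column logic without any network access.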

other resources about DOI feature:

jroachell15 commented 2 years ago

INWT comment (August 21, 2021), CrossRef: we can only offer the on-the-fly solution; we won't alter the data format in the package.

[ ] this will include an option to query the database and select one format from a dropdown, which will then be added as an additional column: ~1-2 days or 12 h (assuming that the R package works as expected)

jroachell15 commented 2 years ago

Step 1: Register for the Polite Pool by giving an email address

Be nice and share your email with Crossref. The Crossref team encourages requests with appropriate contact information and will forward you to a dedicated API cluster for improved performance when you share your email address with them.

https://github.com/CrossRef/rest-api-doc#good-manners--more-reliable-service

To pass your email address to Crossref via this client, simply store it as an environment variable in .Renviron like this:

[x] 1. Open file: file.edit("~/.Renviron")
[x] 2. Add the email address to be shared with Crossref: crossref_email = name@example.com
[x] 3. Save the file and restart your R session
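After the restart, a quick check like the following can confirm that the variable is visible to R. This is only a sketch: the function name `crossrefEmailIsSet` is made up for illustration, and the pattern is a rough sanity check, not a full email validation.

```r
# Hypothetical helper: TRUE when the crossref_email environment variable
# looks like an email address (one "@" with text on both sides, no spaces).
crossrefEmailIsSet <- function() {
  email <- Sys.getenv("crossref_email", unset = "")
  grepl("^[^@[:space:]]+@[^@[:space:]]+$", email)
}
```

If this returns FALSE in a fresh session, the entry in .Renviron was probably not saved or the session was not restarted.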

jroachell15 commented 2 years ago

Step 2: Testing the rcrossref R package with these functions:

FUNCTION to implement:

cr_cn( dois, format = "bibtex", style = "apa", locale = "en-US", raw = FALSE, .progress = "none", url = NULL, ... )

Example:

cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "bibtex") :

[1] "@article{Salesse_2018,\n\tdoi = {10.1016/j.jasrep.2017.07.030},\n\turl = {https://doi.org/10.1016%2Fj.jasrep.2017.07.030},\n\tyear = 2018,\n\tmonth = {jun},\n\tpublisher = {Elsevier {BV}},\n\tvolume = {19},\n\tpages = {1050--1055},\n\tauthor = {Kevin Salesse and Ricardo Fernandes and Xavier de Rochefort and Jaroslav Br{\r{u}}{\v{z}}ek and Dominique Castex and {\'{E}}lise Dufour},\n\ttitle = {{IsoArcH}.eu: An open-access and collaborative isotope database for bioarchaeological samples from the Graeco-Roman world and its margins},\n\tjournal = {Journal of Archaeological Science: Reports}\n}"
cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "text"):

[1] "Salesse, K., Fernandes, R., de Rochefort, X., Brůžek, J., Castex, D., & Dufour, É. (2018). IsoArcH.eu: An open-access and collaborative isotope database for bioarchaeological samples from the Graeco-Roman world and its margins. Journal of Archaeological Science: Reports, 19, 1050–1055. https://doi.org/10.1016/j.jasrep.2017.07.030"
cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "rdf-xml") :

{xml_document} RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:j.0="http://purl.org/dc/terms/" xmlns:j.1="http://prismstandard.org/namespaces/basic/2.1/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:j.2="http://purl.org/ontology/bibo/" xmlns:j.3="http://xmlns.com/foaf/0.1/"> [1] \n <j.1:doi>10.1016/j.jasrep.2017.07.030</j.1:doi>\n <j.1:startingPage>105 ...
cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "ris") :

[1] "TY - JOUR\nDO - 10.1016/j.jasrep.2017.07.030\nUR - http://dx.doi.org/10.1016/j.jasrep.2017.07.030\nTI - IsoArcH.eu: An open-access and collaborative isotope database for bioarchaeological samples from the Graeco-Roman world and its margins\nT2 - Journal of Archaeological Science: Reports\nAU - Salesse, Kevin\nAU - Fernandes, Ricardo\nAU - de Rochefort, Xavier\nAU - Brůžek, Jaroslav\nAU - Castex, Dominique\nAU - Dufour, Élise\nPY - 2018\nDA - 2018/06\nPB - Elsevier BV\nSP - 1050-1055\nVL - 19\nSN - 2352-409X\nER - \n"
cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "turtle") :

[1] "http://id.crossref.org/contributor/elise-dufour-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://xmlns.com/foaf/0.1/familyName\n \"Dufour\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Élise\" ;\n http://xmlns.com/foaf/0.1/name\n \"Élise Dufour\" .\n\nhttp://id.crossref.org/contributor/jaroslav-bruzek-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://xmlns.com/foaf/0.1/familyName\n \"Brůžek\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Jaroslav\" ;\n http://xmlns.com/foaf/0.1/name\n \"Jaroslav Brůžek\" .\n\nhttp://id.crossref.org/contributor/ricardo-fernandes-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://xmlns.com/foaf/0.1/familyName\n \"Fernandes\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Ricardo\" ;\n http://xmlns.com/foaf/0.1/name\n \"Ricardo Fernandes\" .\n\nhttp://id.crossref.org/issn/2352-409X\n a http://purl.org/ontology/bibo/Journal ;\n http://prismstandard.org/namespaces/basic/2.1/issn\n \"2352-409X\" ;\n http://purl.org/dc/terms/title\n \"Journal of Archaeological Science: Reports\" ;\n http://purl.org/ontology/bibo/issn\n \"2352-409X\" ;\n http://www.w3.org/2002/07/owl#sameAs\n \"urn:issn:2352-409X\" .\n\nhttp://id.crossref.org/contributor/dominique-castex-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://xmlns.com/foaf/0.1/familyName\n \"Castex\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Dominique\" ;\n http://xmlns.com/foaf/0.1/name\n \"Dominique Castex\" .\n\nhttp://id.crossref.org/contributor/xavier-de-rochefort-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://xmlns.com/foaf/0.1/familyName\n \"de Rochefort\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Xavier\" ;\n http://xmlns.com/foaf/0.1/name\n \"Xavier de Rochefort\" .\n\nhttp://dx.doi.org/10.1016/j.jasrep.2017.07.030\n http://prismstandard.org/namespaces/basic/2.1/doi\n \"10.1016/j.jasrep.2017.07.030\" ;\n http://prismstandard.org/namespaces/basic/2.1/endingPage\n \"1055\" ;\n 
http://prismstandard.org/namespaces/basic/2.1/startingPage\n \"1050\" ;\n http://prismstandard.org/namespaces/basic/2.1/volume\n \"19\" ;\n http://purl.org/dc/terms/creator\n http://id.crossref.org/contributor/elise-dufour-183ka07b0x9l7 , http://id.crossref.org/contributor/kevin-salesse-183ka07b0x9l7 , http://id.crossref.org/contributor/xavier-de-rochefort-183ka07b0x9l7 , http://id.crossref.org/contributor/dominique-castex-183ka07b0x9l7 , http://id.crossref.org/contributor/jaroslav-bruzek-183ka07b0x9l7 , http://id.crossref.org/contributor/ricardo-fernandes-183ka07b0x9l7 ;\n http://purl.org/dc/terms/date\n \"2018-06\"^^http://www.w3.org/2001/XMLSchema#gYearMonth ;\n http://purl.org/dc/terms/identifier\n \"10.1016/j.jasrep.2017.07.030\" ;\n http://purl.org/dc/terms/isPartOf\n http://id.crossref.org/issn/2352-409X ;\n http://purl.org/dc/terms/publisher\n \"Elsevier BV\" ;\n http://purl.org/dc/terms/title\n \"IsoArcH.eu: An open-access and collaborative isotope database for bioarchaeological samples from the Graeco-Roman world and its margins\" ;\n http://purl.org/ontology/bibo/doi\n \"10.1016/j.jasrep.2017.07.030\" ;\n http://purl.org/ontology/bibo/pageEnd\n \"1055\" ;\n http://purl.org/ontology/bibo/pageStart\n \"1050\" ;\n http://purl.org/ontology/bibo/volume\n \"19\" ;\n http://www.w3.org/2002/07/owl#sameAs\n <info:doi/10.1016/j.jasrep.2017.07.030> , <doi:10.1016/j.jasrep.2017.07.030> .\n\nhttp://id.crossref.org/contributor/kevin-salesse-183ka07b0x9l7\n a http://xmlns.com/foaf/0.1/Person ;\n http://www.w3.org/2002/07/owl#sameAs\n http://orcid.org/0000-0003-2492-1536 ;\n http://xmlns.com/foaf/0.1/familyName\n \"Salesse\" ;\n http://xmlns.com/foaf/0.1/givenName\n \"Kevin\" ;\n http://xmlns.com/foaf/0.1/name\n \"Kevin Salesse\" .\n"
[x] Package works in the development environment

jroachell15 commented 2 years ago

Step 3: Modify functions under DOI feature:

https://github.com/Pandora-IsoMemo/iso-app/blob/main/R/03-dataExplorer.R

You can save these different formats in separate columns. This can be done while obtaining the DOI, but if the DOI is given by the user, then it should be used instead to get the citation. If none is found, then write "none".

functions to look at: generateCitation()

Ask Andreas: (1) Should I take the column compiledDOI or one of the other DOI columns as the input of the new function? cr_cn(dois="https://doi.org/10.1016/j.jasrep.2017.07.030", format= "turtle")

User Interface: change to a dropdown to select the different format types (bibtex, xml, etc.)

Is this where the output of the cr_cn() goes? but where does the function go?

jroachell15 commented 2 years ago

@jroachell15

Confirm with Ricardo that we can remove the current dropdown box and replace it with the output of the functions,
and confirm which DOI column should be the input:
- compilationDOI
- originalDOI
- databaseDOI

arunge commented 2 years ago

@isomemo As we just discussed, the idea is to add

the input for selecting the format of the reference here at (1) and call it "Citation Format" and
to keep the option "Citation Type" (2) but to rename this to "Citation Export FileType"

Is that correct?

jroachell15 commented 2 years ago

@isomemo Isomemo/Pandora will summarize and clarify the instructions: https://cran.r-project.org/web/packages/rcrossref/rcrossref.pdf

isomemo commented 2 years ago

@jroachell15 @arunge

The rcrossref R package allows one to select "format" and "style".

Style refers to the string for each reference. Options include Harvard, Chicago... Format refers to the format in which the text strings for the references will be organized in a file. Options include Bibtex, RIS...

We already have a basic option to export different formats named "Citation type". This should be kept but renamed to "Export format (input style)".

The use of the rcrossref package will require two options (to be placed below the option above):

- Selection of the style from a pending list, to be named "Modify citation style"
- Selection of the file format from a pending list, to be named "Export format (modified)"
- Below these selections, a button "Export modified citation"

jroachell15 commented 2 years ago

@isomemo Because of the Style argument, which was not considered in the beginning, it might take a few more hours to implement than originally anticipated.

Input columns: databaseDOI, originalDOI, compilationDOI. When the DOI is empty, the text reference is generated.

User input via dropdown, e.g. harvard vs. chicago style: fetch the text string and automatically generate three new columns based on the three DOI columns above.

Visually UI:

[x] dropdown for style as input into the function
[x] dropdown for format as input into the function

Look at the server functions and understand where the output of the UI can flow into the current functions in the server function. Trace the outputs. Functionality:

[ ] Style: apa, ama, chicago, harvard, MLA, AP @isomemo any other styles you want in the dropdown?
[ ] Format: "rdf-xml", "turtle", "citeproc-json", etc.

Tips: highlight the function and press F2 when you change an input. Observe: we need a new observe function (name of the data frame).

First Step:

find the data frame: put the empty columns and the column names there
output: each output column belongs to one DOI column; position the output after the relevant DOI column

Note the difference between Server function and UI script

need to look at:

Namespace.R
R/01-citation.R
R/01-datTable.R -- line 14: add the generateRcrossref() function here, and make sure to also change the column names
R/03-dataExplorer.R --
-> line 92: selectInput(ns("citationType"), "Export format", selected = "txt", choices = c("txt", "xml", "json")),
-> line 93: selectInput(ns("citationStyle"), "Citation style", selected = "txt", choices = c("txt", "xml", "json"))

jroachell15 commented 2 years ago

@isomemo

Ricardo: the options dropdown for Citation: APA style as default. What other style should I include here?

Options drop down for citation format: with bibtex as default
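A rough sketch of how the two dropdowns could look with these defaults, assuming shiny is installed. The style ids below are my assumption of the matching CSL ids and should be checked against rcrossref::get_styles(). The helper name `citationInputs` and the input id "citationFormat" are hypothetical. As far as I understand the rcrossref documentation, the style argument only takes effect when format = "text".

```r
# Assumed CSL style ids for the styles named in this thread; confirm each one
# against rcrossref::get_styles() before relying on it.
citationStyleChoices <- c(
  "APA"     = "apa",
  "AMA"     = "american-medical-association",
  "Chicago" = "chicago-author-date",
  "Harvard" = "harvard-cite-them-right",
  "MLA"     = "modern-language-association",
  "AP"      = "associated-press"
)

# Formats that worked for Crossref DOIs in the tests earlier in this thread.
citationFormatChoices <- c("bibtex", "text", "ris", "rdf-xml", "turtle",
                           "citeproc-json", "crossref-xml")

# Hypothetical UI helper: `ns` is the module namespace function.
citationInputs <- function(ns) {
  shiny::tagList(
    shiny::selectInput(ns("citationStyle"), "Citation style",
                       choices = citationStyleChoices, selected = "apa"),
    shiny::selectInput(ns("citationFormat"), "Citation format",
                       choices = citationFormatChoices, selected = "bibtex")
  )
}
```

On the server side, the two values would then be read as input$citationStyle and input$citationFormat and passed on to cr_cn().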

jroachell15 commented 2 years ago

@isomemo please comment on what other citation styles, thanks!

jroachell15 commented 2 years ago

@isomemo

Hi Ricardo,

I have encountered a performance issue. While testing on the backend, I found that the rcrossref function takes a long time to run.

I only ran 10 rows, and it is taking about 25 seconds. Here is the backend code in R:

library(rcrossref)

# Append one citation column per DOI column (one Crossref request per cell).
# Note: results are stored as strings, so text-like formats work best here.
generateDOI <- function(DF, citationtype, citationstyle) {
  # Fetch a single citation; NA for an empty DOI or a failed lookup, so that
  # each output column keeps exactly one entry per row of DF.
  fetchOne <- function(x) {
    if (is.na(x) || !nzchar(x)) return(NA_character_)
    out <- tryCatch(
      cr_cn(dois = x, format = citationtype, style = citationstyle),
      error = function(e) NULL
    )
    if (is.null(out)) NA_character_ else paste(as.character(out), collapse = "\n")
  }
  fetchColumn <- function(dois) {
    vapply(as.character(dois), fetchOne, character(1), USE.NAMES = FALSE)
  }
  DOI <- cbind(
    databaseDOIout    = fetchColumn(DF$databaseDOI),
    originalDOIout    = fetchColumn(DF$originalDataDOI),
    compilationDOIout = fetchColumn(DF$compilationDOI)
  )
  DF <- cbind(DF, DOI)
  return(DF)
}

Because every row triggers three API requests, one for each DOI column (originalDataDOI, compilationDOI, and databaseDOI), it takes a long time.

Given this, the feature will be very slow: if the user picks 14CSea as the database (about 2000 rows) and chooses bibtex as the format and APA as the style, it will take approximately 83 minutes (about 2.5 seconds per row). I don't think a user would wait this long. We should discuss this further, perhaps at the next meeting.

Potential solution: I see that many of the DOIs are the same, so we could call the crossref function once for each unique DOI and reuse the result for the rows that share it. From a user perspective, do you think the user needs the reference output for each row of the data? Or is there a different way to express these citation styles for every unique DOI?
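The "once per unique DOI" idea could be sketched like this. The helper name `citeUnique` and the `fetch` argument are hypothetical. `fetch` stands for any function that takes one DOI and returns one string, for example function(doi) rcrossref::cr_cn(dois = doi, format = "bibtex").

```r
# Hypothetical helper: look up each distinct DOI once and map the results
# back onto the original vector, keeping one entry per input row.
citeUnique <- function(dois, fetch) {
  dois <- as.character(dois)
  # Only non-empty, non-missing DOIs are worth a request.
  valid <- unique(dois[!is.na(dois) & nzchar(dois)])
  # Named character vector: names are the DOIs, values are the citations.
  lookup <- vapply(valid, fetch, character(1))
  # Rows without a DOI (NA or "") come back as NA.
  unname(lookup[dois])
}
```

The number of requests then depends on the number of distinct DOIs rather than on the number of rows, so the gain is largest for databases where most rows share a few references.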

jroachell15 commented 2 years ago

Andreas:

users write down the reference, but sometimes not the DOI,
the script does a DOI search and, at the same time, also searches by string and saves the reference in bibtex style,
then converting the citation: the input is the bibtex string (a string operation),

Optimizing code:

lapply():

arunge commented 2 years ago

Since there was no way to get the citation conversion fast enough for an interactive context, we decided to adjust the code where the data sources are loaded and written to a database: https://github.com/Pandora-IsoMemo/iso-data The app only loads pre-formatted tables from this database.

New goals:

- add a "bibtex" citation style column to the database, such that the app only needs to translate from one style into another style. The translation is only a string operation, which should be much faster than loading references from DOIs on the fly.
- implement the UI and logic to change the reference style

@jroachell15 I will take over this ticket and develop a solution here.

@isomemo We need to put this as another task into the new task list.

isomemo commented 2 years ago

Sounds good! I would assume that BibTeX format is possible and the best choice here. The fields included in the BibTeX format are described here: https://www.bibtex.com/g/bibtex-format/

If asked, choose all fields. Citekey identifies each individual citation.

For the batch export of multiple citations, I believe it is only necessary to concatenate the individual citations, leaving an empty line between citations for readability.
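The concatenation described above could be as small as the following sketch. The function name `combineBibtex` is hypothetical, and dropping duplicate entries is my own addition, not something requested here.

```r
# Hypothetical helper: join BibTeX entries into one string with an empty
# line between entries. Missing and empty entries are skipped.
combineBibtex <- function(entries) {
  entries <- entries[!is.na(entries) & nzchar(entries)]
  # Trim surrounding whitespace so the spacing between entries is uniform.
  entries <- trimws(entries)
  # Assumption: a reference shared by many rows should appear only once.
  paste(unique(entries), collapse = "\n\n")
}
```

The result can then be written to a file with writeLines(combineBibtex(entries), "citations.bib"), where the file name is only an example.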

arunge commented 1 year ago

There is a newer ticket covering the updates to BibTeX and Crossref. I am closing this one.

Pandora-IsoMemo / DSSM

[MPI_2022]: Pandora Iso-app - data tables plus interactive map (Part 2) #14
