Pandora-IsoMemo / DSSM

Pandora & IsoMemo spatiotemporal modeling (DSSM)
https://pandora-isomemo.github.io/DSSM/
GNU General Public License v3.0

[MPI_2022]: Pandora Iso-app - data tables plus interactive map (Part 2) #14

Closed jroachell15 closed 1 year ago

jroachell15 commented 2 years ago

Essentially, implement this package in the app:

Other resources about the DOI feature:

jroachell15 commented 2 years ago

INWT comment (August 21, 2021) on CrossRef: we can only offer the on-the-fly solution; we won't alter the data format in the package.

jroachell15 commented 2 years ago

Step 1: Register for the Polite Pool by giving an email address

Be nice and share your email with Crossref. The Crossref team encourages requests with appropriate contact information and will forward you to a dedicated API cluster for improved performance when you share your email address with them.

https://github.com/CrossRef/rest-api-doc#good-manners--more-reliable-service

To pass your email address to Crossref via this client, simply store it as environment variable in .Renviron like this:
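A minimal sketch of that setup, assuming the `crossref_email` variable name given in the rcrossref README; the address is a placeholder. The `.Renviron` line persists across sessions, while `Sys.setenv()` only lasts for the current R session.

```r
# Persistent: add this line to ~/.Renviron (open it with
# file.edit("~/.Renviron")), then restart R:
#   crossref_email=name@example.com

# Session-only alternative for quick testing (placeholder address).
Sys.setenv(crossref_email = "name@example.com")

# Confirm that R can see the variable before calling any rcrossref function.
email <- Sys.getenv("crossref_email")
stopifnot(nzchar(email))
```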

jroachell15 commented 2 years ago

Step 2: Testing the Rcrossref R package with these functions:

FUNCTION to implement:

Example:

jroachell15 commented 2 years ago

Step 3: Modify functions under DOI feature:

https://github.com/Pandora-IsoMemo/iso-app/blob/main/R/03-dataExplorer.R

You can save these different formats in separate columns. This can be done while obtaining the DOI, but if the DOI is given by the user, it should be used instead to generate the citation; if none is found, write "none".

functions to look at: generateCitation()

[image] Ask Andreas: (1) Should I take the compiledDOI column or the other DOI columns as the input of the new function? cr_cn(dois = "https://doi.org/10.1016/j.jasrep.2017.07.030", format = "turtle")

[image] User interface: change this to a drop-down that selects the different format types (bibtex, xml, etc.).

[image] Is this where the output of cr_cn() goes? And where does the function itself go?

jroachell15 commented 2 years ago

@jroachell15

  1. Confirm with Ricardo that we can remove the current drop-down box and replace it with the output of the functions: [image]

  2. Confirm which DOI column should be the input:

    • compilationDOI
    • originalDOI
    • databaseDOI

arunge commented 2 years ago

@isomemo As we just discussed, the idea is to add

Is that correct?

[image]

jroachell15 commented 2 years ago

@isomemo Isomemo/Pandora will summarize and clarify the instructions: https://cran.r-project.org/web/packages/rcrossref/rcrossref.pdf

isomemo commented 2 years ago

@jroachell15 @arunge

The rcrossref R package allows one to select "format" and "style".

Style refers to the string for each reference. Options include Harvard, Chicago...

Format refers to the format in which the text strings for the references will be organized in a file. Options include BibTeX, RIS...
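To make the distinction concrete, here is a sketch of how the two choices could be turned into arguments for `cr_cn()`. The helper `cnArgs()` is hypothetical. The format names are the ones listed in the rcrossref manual and, per that manual, `style` only applies to the `"text"` format, so the sketch drops it otherwise. Available style names can be listed with `rcrossref::get_styles()`.

```r
# Formats documented for rcrossref::cr_cn().
cnFormats <- c("rdf-xml", "turtle", "citeproc-json", "citeproc-json-ish",
               "text", "ris", "bibtex", "crossref-xml", "datacite-xml",
               "bibentry", "crossref-tdm")

# Hypothetical helper: validate the format and build the cr_cn() arguments.
cnArgs <- function(doi, format = "bibtex", style = "apa") {
  format <- match.arg(format, cnFormats)  # errors on unknown names
  args <- list(dois = doi, format = format)
  # A CSL style only has an effect on plain-text citations.
  if (format == "text") args$style <- style
  args
}

args <- cnArgs("10.1016/j.jasrep.2017.07.030", format = "text", style = "apa")
# The actual request needs the rcrossref package and network access:
# citation <- do.call(rcrossref::cr_cn, args)
```

One consequence for the UI: a style drop-down is only meaningful while the selected format is plain text.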

We already have a basic option to export different formats named "Citation type". This should be kept but renamed to "Export format (input style)".

The use of the rcrossref package will require two options (to be placed below the option above):

  1. Selection of a style from a drop-down list, to be named "Modify citation style"
  2. Selection of a file format from a drop-down list, to be named "Export format (modified)"

Below these selections, a button "Export modified citation".

jroachell15 commented 2 years ago

@isomemo Because of the implementation of the style argument, which was not considered in the beginning, this might take a few more hours to implement than originally anticipated.

Input columns: databaseDOI, originalDOI, compilationDOI. When the DOI is empty, the text reference is generated.

User input via drop-down: e.g. Harvard vs. Chicago style. Fetch the text string and automatically generate three new columns based on the three DOI columns above.
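The behaviour described above (one new citation column per DOI column, falling back to the existing text reference when the DOI is empty) could look roughly like this. It is only a sketch: `addCitationColumns()`, its `fetch` argument, and the `reference` column are hypothetical names, and the DOIs are placeholders. In the app, `fetch` would wrap `rcrossref::cr_cn()`; a stub stands in here so the logic can be tried without network access.

```r
# Hypothetical helper: add one citation column per DOI column.
# `fetch` is any function(doi, style) that returns a single string.
addCitationColumns <- function(df, fetch, style = "apa",
                               doiCols = c("databaseDOI", "originalDOI",
                                           "compilationDOI"),
                               refCol = "reference") {
  for (col in doiCols) {
    dois <- as.character(df[[col]])
    hasDoi <- !is.na(dois) & nzchar(dois)
    # Start from the existing text reference, then overwrite rows with a DOI.
    out <- as.character(df[[refCol]])
    out[hasDoi] <- vapply(dois[hasDoi], fetch, character(1), style = style,
                          USE.NAMES = FALSE)
    df[[paste0(col, "Citation")]] <- out
  }
  df
}

# Stub in place of the Crossref request (assumption, for illustration only).
stubFetch <- function(doi, style) paste0("[", style, "] ", doi)

df <- data.frame(
  databaseDOI = c("10.1000/a", ""),
  originalDOI = c("", "10.1000/b"),
  compilationDOI = c(NA, "10.1000/c"),
  reference = c("Ref A", "Ref B"),
  stringsAsFactors = FALSE
)
res <- addCitationColumns(df, stubFetch, style = "harvard")
```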

Visually UI:

Look at the server functions and understand where the output of the UI can flow into the current server functions. Trace the outputs.

Functionality:

Tips: highlight the function and press F2 when you change an input. Observe: we need a new observe function (name of the dataframe).

First Step:

  1. Find the dataframe: put the empty columns and column names there.
  2. Output: each column belongs to a DOI; position the output after each relevant DOI.
  3. Highlight the function and press F2 when you change an input. Observe: we need a new observe function (name of the dataframe).

Note the difference between Server function and UI script

need to look at:

jroachell15 commented 2 years ago

@isomemo

Ricardo: the drop-down options for citation style have APA as the default. What other styles should I include here? [image]

The drop-down options for citation format have bibtex as the default. [image]

jroachell15 commented 2 years ago

@isomemo please comment on what other citation styles, thanks!

jroachell15 commented 2 years ago

@isomemo

hi Ricardo,

I have encountered a slight performance issue. I have been testing on the backend, and the rcrossref function takes a long time to run.

I only ran 10 rows, and it takes about 25 seconds. Here is the backend code in R:

library(rcrossref)

# Request one citation; empty DOIs and failed lookups give NA so that every
# output column keeps exactly one entry per row.
fetchCitation <- function(doi, citationtype, citationstyle) {
  if (is.na(doi) || !nzchar(doi)) return(NA_character_)
  res <- tryCatch(
    cr_cn(dois = doi, format = citationtype, style = citationstyle),
    error = function(e) NULL
  )
  if (is.null(res)) NA_character_ else paste(as.character(res), collapse = "\n")
}

generateDOI <- function(DF, citationtype, citationstyle) {
  doiCols <- c(databaseDOIout = "databaseDOI",
               originalDOIout = "originalDataDOI",
               compilationDOIout = "compilationDOI")
  # One API request per row and per DOI column.
  for (out in names(doiCols)) {
    DF[[out]] <- vapply(as.character(DF[[doiCols[[out]]]]), fetchCitation,
                        character(1), citationtype = citationtype,
                        citationstyle = citationstyle, USE.NAMES = FALSE)
  }
  DF
}

Because every row triggers three API requests, one for each DOI column (originalDataDOI, compilationDOI, and databaseDOI), it takes a long time.

So the performance of this feature will be very slow: if the user picks 14CSea as a database (about 2000 rows) and chooses bibtex as the format and APA as the style, it will take approximately 83 minutes. I don't think a user would wait this long. We should talk about this issue further, perhaps at the next meeting.
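The estimate follows from scaling the measured timing linearly, which assumes the time per row stays roughly constant:

```r
# Measured: 10 rows took about 25 seconds (three Crossref requests per row).
secondsPerRow <- 25 / 10

# Extrapolate to a database of about 2000 rows.
rows <- 2000
totalMinutes <- rows * secondsPerRow / 60
round(totalMinutes)
```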

Potential solution: many of the DOIs are the same, so we could run the Crossref function once for each unique DOI and copy the result to the other rows with the same DOI. From a user perspective, do you think the user would need the DOI output reference for each row of the data? Or is there a different way to express these citation styles for every unique DOI?
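A sketch of that idea, assuming a hypothetical helper `fetchUnique()` and a stub in place of the real `cr_cn()` call (the DOIs are placeholders). `unique()` reduces the requests to one per distinct DOI, and `match()` copies each result back to every row that shares it.

```r
# Hypothetical helper: call `fetch` once per unique DOI and reuse the result
# for every row with the same DOI.
fetchUnique <- function(dois, fetch) {
  dois <- as.character(dois)
  uniqueDois <- unique(dois[!is.na(dois) & nzchar(dois)])
  citations <- vapply(uniqueDois, fetch, character(1), USE.NAMES = FALSE)
  # match() maps each row back to its unique DOI; empty or NA rows stay NA.
  citations[match(dois, uniqueDois)]
}

# Stub that counts how often it is called (stands in for cr_cn()).
calls <- 0
countingFetch <- function(doi) {
  calls <<- calls + 1
  paste("citation for", doi)
}

dois <- c("10.1000/a", "10.1000/a", "", "10.1000/b", "10.1000/a", NA)
out <- fetchUnique(dois, countingFetch)
calls  # 2 requests instead of 4
```

The saving depends entirely on how many DOIs repeat, so the number of unique values in each DOI column is worth checking before estimating the new run time.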

jroachell15 commented 2 years ago

Andreas:

Optimizing code:

arunge commented 2 years ago

Since it was not possible to reach a citation conversion speed that is usable in an interactive context, we decided to adjust the code where the data sources are loaded and written to a database (https://github.com/Pandora-IsoMemo/iso-data). The app only loads pre-formatted tables from this database.

New goals:

@jroachell15 I will take over this ticket and develop a solution here.

@isomemo We need to put this as another task into the new task list.

isomemo commented 2 years ago

Sounds good! I would assume that BibTeX format is possible and the best choice here. The fields included in the BibTeX format are described here: https://www.bibtex.com/g/bibtex-format/

If asked, choose all fields. Citekey identifies each individual citation.

For the batch export of multiple citations, I believe it is only necessary to concatenate the individual citations, leaving an empty line between citations for readability.
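As a rough sketch of such a batch export: the helper `combineBibtex()` is hypothetical, the entries are placeholders rather than real references, and dropping repeated entries with `unique()` is an extra assumption not stated above.

```r
# Hypothetical helper: join BibTeX entries with one empty line in between.
combineBibtex <- function(entries) {
  entries <- trimws(entries[!is.na(entries) & nzchar(entries)])
  # Assumption: an entry repeated across rows is exported only once.
  paste(unique(entries), collapse = "\n\n")
}

# Placeholder entries, for illustration only.
entries <- c(
  "@article{key1,\n  title = {First placeholder}\n}",
  "@article{key2,\n  title = {Second placeholder}\n}",
  "@article{key1,\n  title = {First placeholder}\n}"
)
bib <- combineBibtex(entries)
cat(bib, "\n")
# writeLines(bib, "citations.bib")  # file name is only an example
```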

arunge commented 1 year ago

There is a newer ticket covering updates to BibTeX and Crossref. I am closing this one.