eblondel / zen4R

zen4R - R Interface to Zenodo REST API
https://github.com/eblondel/zen4R/wiki

multiple addRelatedIdentifier with same relation #61

Closed oggioniale closed 2 years ago

oggioniale commented 3 years ago

I am working on uploading a large number of records to Zenodo for the LTER-Italy community. I need to create relations, through DOIs, between each chapter and its figures (already uploaded to Zenodo). I also add a relation between each chapter and the main book (already uploaded to Zenodo). I have written this code:

for (n in 1:nrow(chapters$figures[[i]])) {
  myrec$addRelatedIdentifier(
    relation = "hasPart",
    identifier = chapters$figures[[i]]$figureDOI[n]
  ) # the reference to the DOI of Figures
}
myrec$addRelatedIdentifier("isPartOf", bookDOI) # the reference to the DOI of Volume

Using the zen4R package I obtain only two relations:

Manually, with the Zenodo form in the UI, all the relations are stored.

I'm wondering whether the problem in the package is related to the type of relation. In the first case, all the figures are related to the chapter with the same "hasPart" relation.

Ale

eblondel commented 3 years ago

Hi @oggioniale, I have a test for this; I ran it locally and had no issues with addRelatedIdentifier. In the test case you run, does figureDOI always hold different DOIs?

One possible reason is that if a relation/identifier pair is identical to one already added, it won't be added again. addRelatedIdentifier returns TRUE if the relation was added and FALSE if it already exists. As in the test I ran, you can use the testthat package and check with expect_true that the output of addRelatedIdentifier is TRUE. If it is not, something is wrong in the data (probably duplicate figure DOIs). This could explain why you get only the first 'hasPart' relation and then the book as 'isPartOf'.

If you can, send me a reproducible script and I will troubleshoot further.
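
The duplicate check suggested above can be done on the data before any record is built. The following is a minimal sketch in base R, assuming the figure table has `chapter`, `relation` and `figureDOI` columns as in the reporter's script; the `figures` data frame and its DOIs are made-up placeholders, not the reporter's spreadsheet.

```r
# Illustration data only: the DOIs below are placeholders.
figures <- data.frame(
  chapter   = c(8, 8, 9, 9),
  relation  = c("hasPart", "hasPart", "hasPart", "hasPart"),
  figureDOI = c("10.1234/a", "10.1234/a", "10.1234/b", "10.1234/c"),
  stringsAsFactors = FALSE
)

# Flag rows whose (chapter, relation, figureDOI) combination already
# appeared earlier in the table.
is_dup <- duplicated(figures[, c("chapter", "relation", "figureDOI")])

# Rows that repeat an earlier pair within the same chapter.
dup_rows <- figures[is_dup, ]
print(dup_rows)

# Number of distinct relation/identifier pairs per chapter.
distinct_per_chapter <- table(figures$chapter[!is_dup])
print(distinct_per_chapter)
```

If `dup_rows` is empty for the real data, duplicate DOIs are unlikely to be the explanation and the cause has to be looked for elsewhere.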

oggioniale commented 3 years ago

Hi @eblondel, this is a reproduction of my script in a small scale:

# Upload script for Chapter of Volume about LTER-Italy ------ 

# zen4R - R Interface to Zenodo REST API
# Blondel, Emmanuel, & Barde, Julien. (2019, August 27). zen4R: R Interface to Zenodo REST API (Version 0.3). Zenodo. http://doi.org/10.5281/zenodo.3378733
# more info about this package here: https://github.com/eblondel/zen4R/wiki

##
# LTER-Italy token -------
##
tokenLTERItaly <- 'mytoken'

zenodo <- zen4R::ZenodoManager$new(
  # url = "http://sandbox.zenodo.org/api", # to test record management before going with the production Zenodo e-infrastructure
  token = tokenLTERItaly,
  logger = "DEBUG" # "INFO"
)

##
# data ------
##
library(tidyr)

chapters <- readxl::read_excel("./test_Zen4R/chapters.xlsx") %>% tibble::as_tibble()
figures <- readxl::read_excel("./test_Zen4R/figuresDoi.xlsx") %>% tibble::as_tibble()

figures_nested <- figures %>% 
  group_by(chapter) %>%
  nest()

chapters$figures <- figures_nested$data

###
# records management ------
###
### chapters
for (i in 1:nrow(chapters)) {
  # if (chapters$doi == 1) {
  message(
    paste0(
      "\n ---- New chapter ---- \n",
      "chapter n:", chapters$id[[i]], "\n"
    )
  )
  myrec <- zen4R::ZenodoRecord$new()

  myrec$setUploadType("publication")
  myrec$setPublicationType(chapters$type[i])
  myrec$setTitle(chapters$title[i])
  myrec$setDescription(chapters$abstract[i])
  myrec$addCreator(
    firstname = "Alessandro",
    lastname = "Oggioni",
    affiliation = "CNR-IREA"
  )
  myrec$setLicense("CC-BY-SA-4.0")
  myrec$setAccessRight("open")
  myrec$setVersion("1.0")
  myrec$setLanguage("ita")
  myrec$setKeywords(c("LTER-Italy"))
  for (n in 1:nrow(chapters$figures[[i]])) {
    if (!is.na(chapters$figures[[i]]$figureDOI[1])) {
      # this part is just a test
      # message(
      #   paste0(
      #     "hasPart\n",
      #     "DOI", chapters$figures[[i]]$figureDOI[n]
      #   )
      # )
      myrec$addRelatedIdentifier(
        relation = chapters$figures[[i]]$relation[n],
        identifier = chapters$figures[[i]]$figureDOI[n]
      ) # the reference to the DOI of eLTER-SiteFigure
    }
  }
  myrec$addRelatedIdentifier("isPartOf", bookDOI) # the reference to the DOI of Volume
  myrec$setCommunities("lter-italy")
  myrec$setGrants("654359") # eLTER
  myrec$setGrants("871126") # eLTER-PPP
  myrec$setGrants("871128") # eLTER-Plus

  myrec <- zenodo$depositRecord(myrec)
  # }
}

I also attach the 2 Excel sheets that I use as data storage.

chapters.xlsx figuresDoi.xlsx

Let me know.

Ale

eblondel commented 3 years ago

I tested it with my sandbox and it worked (I had to create a 'bookDOI' and removed the "lter-italy" community not available on sandbox). All related identifiers were added correctly.

oggioniale commented 3 years ago

Hi, I ran the script again exactly as I submitted it here, and it doesn't work. For the first chapter, 3 relations must be created: one with the main book (DOI 10.5281/zenodo.5570272) and 2 with the figures:

>   chapters$figures[[i]]
# A tibble: 2 × 3
  relation figureTitle   figureDOI             
  <chr>    <chr>         <chr>                 
1 hasPart  title_figure1 10.5281/zenodo.5235920
2 hasPart  title_figure2 10.5281/zenodo.5235918

but I obtain only 2 relations:

>   myrec$metadata$related_identifiers
[[1]]
[[1]]$relation
[1] "hasPart"

[[1]]$identifier
[1] "10.5281/zenodo.5235920"

[[2]]
[[2]]$relation
[1] "isPartOf"

[[2]]$identifier
[1] "10.5281/zenodo.5570272"

The same with the second chapter.

Ale

eblondel commented 3 years ago

I'd suggest you wrap the addRelatedIdentifier call in testthat::expect_true() to check what happens. addRelatedIdentifier returns TRUE if the relation was added, FALSE otherwise.
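
As an alternative to stopping at the first failed expectation, the returned flags can be collected next to the data so that every rejected row is visible. The sketch below is only an illustration: `new_mock_record()` and `add_all()` are hypothetical helpers, and the mock merely imitates the behaviour described in this thread (FALSE when the relation/identifier pair is already present). It is not zen4R's implementation, and the DOIs are placeholders.

```r
# Hypothetical stand-in for a record object. It only mimics the behaviour
# described in this thread; it is NOT zen4R's implementation.
new_mock_record <- function() {
  related <- list()
  list(
    addRelatedIdentifier = function(relation, identifier) {
      for (r in related) {
        if (r$relation == relation && r$identifier == identifier) {
          return(FALSE)
        }
      }
      related[[length(related) + 1]] <<- list(
        relation = relation, identifier = identifier
      )
      TRUE
    },
    relatedIdentifiers = function() related
  )
}

# Add every row of `figs` and keep the returned flag in a new column,
# so rejected rows can be inspected afterwards.
add_all <- function(rec, figs) {
  figs$added <- vapply(
    seq_len(nrow(figs)),
    function(n) {
      isTRUE(rec$addRelatedIdentifier(figs$relation[n], figs$figureDOI[n]))
    },
    logical(1)
  )
  figs
}

figs <- data.frame(
  relation  = c("hasPart", "hasPart", "hasPart"),
  figureDOI = c("10.1234/a", "10.1234/b", "10.1234/a"), # placeholder DOIs
  stringsAsFactors = FALSE
)

rec <- new_mock_record()
report <- add_all(rec, figs)
print(report[!report$added, ])
```

With a real record, the object created in the script (for example via `zen4R::ZenodoRecord$new()`) would be passed to `add_all()` in place of the mock. `seq_len(nrow(figs))` is used instead of `1:nrow(figs)` so that a chapter without figures yields no iterations.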

oggioniale commented 3 years ago

I wrapped addRelatedIdentifier with testthat:

for (n in 1:nrow(chapters$figures[[i]])) {
  testthat::expect_true(
    myrec$addRelatedIdentifier(
      relation = chapters$figures[[i]]$relation[n],
      identifier = chapters$figures[[i]]$figureDOI[n]
    ) # the reference to the DOI of eLTER-SiteFigure
  )
}

The result is:

Error: myrec$addRelatedIdentifier(...) is not TRUE
`actual`:   FALSE
`expected`: TRUE 

:-(

eblondel commented 3 years ago

I've noticed that you use the sandbox URL with http; can you switch to https?

I attach here a script that I've tested. Apart from the URL, I've only wrapped the code in lapply to return the list of deposits, after which I check the length of related_identifiers for each record. I get 3 and 5, which matches what you should get. Can you try this code? Make sure that you are running the latest version.

Script:

##
# LTER-Italy token -------
##
tokenLTERItaly <- "your token"

zenodo <- zen4R::ZenodoManager$new(
  url = "https://sandbox.zenodo.org/api", # to test record management before going with the production Zenodo e-infrastructure
  token = tokenLTERItaly,
  logger = "DEBUG" # "INFO"
)

##
# data ------
##
library(tidyr)

chapters <- readxl::read_excel("./test_Zen4R/chapters.xlsx") %>% tibble::as_tibble()
figures <- readxl::read_excel("./test_Zen4R/figuresDoi.xlsx") %>% tibble::as_tibble()

figures_nested <- figures %>% 
  group_by(chapter) %>%
  nest()

chapters$figures <- figures_nested$data

###
# records management ------
###
### chapters
recs = lapply(1:nrow(chapters), function(i) {
  # if (chapters$doi == 1) {
  message(
    paste0(
      "\n ---- New chapter ---- \n",
      "chapter n:", chapters$id[[i]], "\n"
    )
  )
  myrec <- zenodo$createEmptyRecord()

  myrec$setUploadType("publication")
  myrec$setPublicationType(chapters$type[i])
  myrec$setTitle(chapters$title[i])
  myrec$setDescription(chapters$abstract[i])
  myrec$addCreator(
    firstname = "Alessandro",
    lastname = "Oggioni",
    affiliation = "CNR-IREA"
  )
  myrec$setLicense("CC-BY-SA-4.0")
  myrec$setAccessRight("open")
  myrec$setVersion("1.0")
  myrec$setLanguage("ita")
  myrec$setKeywords(c("LTER-Italy"))
  for (n in 1:nrow(chapters$figures[[i]])) {
    if (!is.na(chapters$figures[[i]]$figureDOI[1])) {
      # this part is just a test
      # message(
      #   paste0(
      #     "hasPart\n",
      #     "DOI", chapters$figures[[i]]$figureDOI[n]
      #   )
      # )
      testthat::expect_true(
        myrec$addRelatedIdentifier(
          relation = chapters$figures[[i]]$relation[n],
          identifier = chapters$figures[[i]]$figureDOI[n]
        ) # the reference to the DOI of eLTER-SiteFigure
      )
    }
  }
  myrec$addRelatedIdentifier("isPartOf", "http://bookDOI") # the reference to the DOI of Volume
  myrec$setGrants("654359") # eLTER
  myrec$setGrants("871126") # eLTER-PPP
  myrec$setGrants("871128") # eLTER-Plus

  return(zenodo$depositRecord(myrec))
  # }
})

sapply(recs, function(x){ length(x$metadata$related_identifiers) })

oggioniale commented 3 years ago

Thanks a lot! Using your script (I also changed the url of the zenodo object) I obtained this:

 ---- New chapter ---- 
chapter n:8

[zen4R][INFO] ZenodoManager - Successful record deposition 
[zen4R][INFO] ZenodoManager - Successful record deposition 

 ---- New chapter ---- 
chapter n:9

[zen4R][INFO] ZenodoManager - Successful record deposition 
[zen4R][INFO] ZenodoManager - Successful record deposition 
> sapply(recs, function(x){ length(x$metadata$related_identifiers) })
[1] 2 2

Only 2 related identifiers for each record. I expected 3 related identifiers for the first and 5 for the second (including the relation with the book).
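
The expected counts can be derived from the figure table and compared with what comes back from the deposit. A minimal base R sketch follows; the chapter ids (8 and 9) and figure counts (2 and 4) are taken from the numbers quoted in this thread, while the `figures` data frame itself is a stripped-down illustration rather than the real spreadsheet.

```r
# Illustration data: one row per figure, chapter ids as quoted in the thread.
figures <- data.frame(
  chapter = c(8, 8, 9, 9, 9, 9),
  stringsAsFactors = FALSE
)
chapter_ids <- c(8, 9)

# Expected number of related identifiers:
# one per figure plus one for the book.
expected <- vapply(
  chapter_ids,
  function(id) sum(figures$chapter == id) + 1L,
  integer(1)
)

# Observed counts, as reported by the sapply() call above.
observed <- c(2L, 2L)

comparison <- data.frame(
  chapter = chapter_ids,
  expected = expected,
  observed = observed,
  ok = expected == observed
)
print(comparison)
```

Any row with `ok` equal to FALSE points at a record whose related identifiers were not all stored.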

eblondel commented 3 years ago

Honestly, I've no idea why it doesn't work on your side. I will dig further.

eblondel commented 2 years ago

@oggioniale back on this issue. First, did you find a way to work around this, or still not? I would be happy to spend some time trying to reproduce it and provide a fix.

oggioniale commented 2 years ago

@eblondel no news from my side. I proceeded manually on Zenodo after the upload through this package. How can we work on this?

eblondel commented 2 years ago

Can you send me the full output of sessionInfo()? I'll try to reproduce exactly your environment

oggioniale commented 2 years ago

Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lubridate_1.8.0 remotes_2.4.2   ReLTER_1.1.0    testthat_3.1.1 

loaded via a namespace (and not attached):
  [1] utf8_1.2.2          dtplyr_1.2.0        R.utils_2.11.0      tidyselect_1.1.2    htmlwidgets_1.5.4   grid_4.1.2          maptools_1.1-2      covr_3.5.1          devtools_2.4.3     
 [10] munsell_0.5.0       jqr_1.2.2           codetools_0.2-18    codemeta_0.1.1      ragg_1.2.1          units_0.7-2         xmlparsedata_1.0.5  withr_2.4.3         colorspace_2.0-2   
 [19] knitr_1.37          uuid_1.1-0          rstudioapi_0.13     wk_0.6.0            gitcreds_0.1.1      Rttf2pt1_1.3.9      rcmdcheck_1.4.0     git2r_0.29.0        conditionz_0.1.0   
 [28] polyclip_1.10-0     farver_2.1.0        rprojroot_2.0.3     vctrs_0.4.1         generics_0.1.2      xfun_0.29           worrms_0.4.2        lintr_2.0.1         R6_2.5.1           
 [37] taxize_0.9.99       fields_13.3         rex_1.2.1           cachem_1.0.6        reshape_0.8.8       assertthat_0.2.1    scales_1.1.1        citation_0.6.2      rgeos_0.5-9        
 [46] gtable_0.3.0        waffle_0.7.0        downlit_0.4.0       lwgeom_0.2-8        processx_3.5.2      bold_1.2.0          spam_2.7-0          rlang_1.0.2         clisymbols_1.2.0   
 [55] cyclocomp_1.1.0     whoami_1.3.0        systemfonts_1.0.3   rgdal_1.5-28        extrafontdb_1.0     lazyeval_0.2.2      dichromat_2.0-0     tmap_3.3-2          s2_1.0.7           
 [64] yaml_2.3.5          abind_1.4-5         crosstalk_1.2.0     extrafont_0.17      tools_4.1.2         usethis_2.1.3       xslt_1.4.3          xopen_1.0.0         ggplot2_3.3.5      
 [73] ellipsis_0.3.2      raster_3.5-11       RColorBrewer_1.1-2  proxy_0.4-26        jsonvalidate_1.3.2  sessioninfo_1.2.2   Rcpp_1.0.8.3        plyr_1.8.6          base64enc_0.1-3    
 [82] visNetwork_2.1.0    rnaturalearth_0.1.0 classInt_0.4-3      purrr_0.3.4         ps_1.6.0            prettyunits_1.1.1   openssl_2.0.0       viridis_0.6.2       tmaptools_3.1-1    
 [91] zoo_1.8-9           fs_1.5.1            leafem_0.1.6        crul_1.2.0          magrittr_2.0.3      data.table_1.14.2   gh_1.3.0            qrcode_0.1.4        whisker_0.4        
[100] pkgload_1.2.4       hms_1.1.1           evaluate_0.14       XML_3.99-0.8        leaflet_2.1.1       gridExtra_2.3       cffr_0.2.2          compiler_4.1.2      credentials_1.3.2  
[109] tibble_3.1.6        maps_3.4.0          V8_4.1.0            KernSmooth_2.23-20  crayon_1.5.1        R.oo_1.24.0         htmltools_0.5.2     tzdb_0.2.0          goodpractice_1.0.2 
[118] tidyr_1.1.4         DBI_1.1.2           badgecreatr_0.2.0   tweenr_1.0.2        praise_1.0.0        MASS_7.3-54         sf_1.0-5            sys_3.4             readr_2.1.1        
[127] cli_3.2.0           R.methodsS3_1.8.1   parallel_4.1.2      dotCall64_1.0-1     pkgconfig_2.0.3     pkgdown_2.0.1       foreign_0.8-81      sp_1.4-6            terra_1.4-22       
[136] xml2_1.3.3          roxygen2_7.1.2      foreach_1.5.1       webshot_0.5.2       stringr_1.4.0       callr_3.7.0         digest_0.6.29       httpcode_0.3.0      rosm_0.2.5         
[145] rmarkdown_2.11      leafsync_0.1.0      curl_4.3.2          lifecycle_1.0.1     nlme_3.1-153        jsonlite_1.8.0      askpass_1.1         desc_1.4.1          viridisLite_0.4.0  
[154] fansi_1.0.3         pillar_1.7.0        lattice_0.20-45     fastmap_1.1.0       httr_1.4.2          pkgbuild_1.2.1      waldo_0.3.1         glue_1.6.2          rworldmap_1.3-6    
[163] diffobj_0.3.5       gert_1.4.3          png_0.1-7           iterators_1.0.13    ggforce_0.3.3       class_7.3-19        stringi_1.7.6       rematch2_2.1.2      textshaping_0.3.6  
[172] stars_0.5-5         memoise_2.0.1       dplyr_1.0.8         e1071_1.7-9         ape_5.6-1

eblondel commented 2 years ago

OK, I've retried on the sandbox and it still works. To make it run I had to load dplyr (since you call group_by upstream in the script), and I removed the community you used because it is not registered in the sandbox infrastructure. I used the 'zenodo' community for the tests.

I also had to hardcode a fixed bookDOI object upstream because it is not part of your test script.

Could you please retry and paste the payload that is posted to Zenodo, which you will see when setting logger = "DEBUG" in the ZenodoManager? See for example the payload sent in my test, where the different related identifiers (hasPart / isPartOf) are pushed:

-> POST /api/deposit/depositions HTTP/1.1
-> Host: sandbox.zenodo.org
-> User-Agent: libcurl/7.64.1 r-curl/4.3 httr/1.4.2
-> Accept-Encoding: deflate, gzip
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/json
-> Authorization: Bearer ......
-> Content-Length: 1161
->

{
  "metadata": {
    "prereserve_doi": true,
    "upload_type": "publication",
    "publication_type": "section",
    "title": "IT25-T",
    "description": "abstract 2\r\n",
    "creators": [
      { "name": "Oggioni, Alessandro", "affiliation": "CNR-IREA" }
    ],
    "license": "CC-BY-SA-4.0",
    "access_right": "open",
    "version": "1.0",
    "language": "ita",
    "keywords": [ "LTER-Italy" ],
    "related_identifiers": [
      { "relation": "hasPart", "identifier": "10.5281/zenodo.5235914" },
      { "relation": "hasPart", "identifier": "10.5281/zenodo.5235912" },
      { "relation": "hasPart", "identifier": "10.5281/zenodo.5235910" },
      { "relation": "hasPart", "identifier": "10.5281/zenodo.5235907" },
      { "relation": "isPartOf", "identifier": "10.4440/505050" }
    ],
    "communities": [ { "identifier": "zenodo" } ],
    "grants": [ { "id": "654359" }, { "id": "871126" }, { "id": "871128" } ]
  }
}

<- HTTP/1.1 201 CREATED
<- Server: nginx
<- Date: Sun, 01 May 2022 12:57:31 GMT
<- Content-Type: application/json
<- Content-Length: 1917

oggioniale commented 2 years ago

Dear @eblondel, sorry for my long silence, but I was busy with other projects!

I ran the code above and the result is now 3 related identifiers for the first record and 5 for the second (including the relation with the book)!

From my side this issue can be closed.