schutzjordan closed this issue 5 years ago
Did you start this all with get_dataset(datasettype="pollen surface sample")?
I'm not sure where sitesdownloadpubs is coming from. Could you post the top of your code, where you generate the sitesdownloadpubs variable?
Using dplyr, neotoma and purrr I can create a data.frame with each dataset ID and the publications associated with that dataset:
library(neotoma)
library(dplyr)
library(purrr)

ssamp <- get_dataset(datasettype = "pollen surface sample",
                     gpid = c("Canada", "United States"))
sspub <- get_publication(ssamp)

assertthat::assert_that(length(ssamp) == length(sspub),
                        msg = "There are missing publication objects.")

dsids <- (1:length(ssamp)) %>%
  map(function(x) {
    data.frame(dsid = ssamp[[x]]$dataset.meta$dataset.id,
               map(sspub[[x]], function(y) y$meta) %>% bind_rows())
  }) %>%
  bind_rows()
This indicates that all pollen surface samples from Neotoma in the US & Canada have at least one publication associated with them.
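If some datasets did lack publications, one way to list their IDs is to compare the full set of dataset IDs against the IDs that appear in the flattened table. The sketch below uses base R and small mock lists that only imitate the nesting used above (dataset.meta$dataset.id and a meta element per publication). The mock objects, the column names inside meta, and the use of an empty list to stand for "no publication" are all assumptions for illustration; real neotoma output may mark a missing publication differently, for example with NA fields.

```r
# Mock objects that imitate the nesting used above. They are NOT real
# neotoma output; the `meta` columns are assumed for illustration.
mock_ssamp <- list(
  list(dataset.meta = list(dataset.id = 101)),
  list(dataset.meta = list(dataset.id = 102)),
  list(dataset.meta = list(dataset.id = 103))
)
mock_sspub <- list(
  list(list(meta = data.frame(id = 1, year = 1990))),
  list(list(meta = data.frame(id = 2, year = 1995)),
       list(meta = data.frame(id = 3, year = 2001))),
  list()  # assumed representation of a dataset with no publication
)

# Build one row per dataset/publication pair. A dataset with no
# publication returns NULL and so contributes no rows.
pub_rows <- lapply(seq_along(mock_ssamp), function(x) {
  metas <- lapply(mock_sspub[[x]], function(y) y$meta)
  if (length(metas) == 0) return(NULL)
  data.frame(dsid = mock_ssamp[[x]]$dataset.meta$dataset.id,
             do.call(rbind, metas))
})
pub_rows <- Filter(Negate(is.null), pub_rows)
mock_dsids <- do.call(rbind, pub_rows)

# Dataset IDs that never appear in the flattened table have no publication.
all_ids <- vapply(mock_ssamp, function(d) d$dataset.meta$dataset.id, numeric(1))
no_pubs <- setdiff(all_ids, mock_dsids$dsid)
print(no_pubs)
```

With this mock data, no_pubs holds the single ID 103. Against real data the same setdiff() comparison should apply, but the test for "no publication" inside the loop would need to match whatever get_publication() actually returns in that case.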
@schutzjordan any update? Did this work for you?
@SimonGoring Hi Simon, sorry for such a late reply! The code you have above seems to have worked; I'll post below what I had to begin with and how I got sitesdownloadpubs.
# Use the datasettype argument to select pollen surface sample datasets only
CanadianSites <- neotoma::get_dataset(datasettype = "pollen surface sample", gpid = "Canada")
AmericanSites <- neotoma::get_dataset(datasettype = "pollen surface sample", gpid = "United States")
# Combine the US and Canadian sites into US_Can_Sites
US_Can_Sites <- neotoma::bind(AmericanSites, CanadianSites)
# Download the data for the combined sites
uscanpollen <- neotoma::get_download(US_Can_Sites)
# Get publications for the downloaded sites
sitesdownloadpubs <- neotoma::get_publication(uscanpollen)
Glad it worked. I'll close this issue for now. If there are issues feel free to re-open.
Not getting dataset IDs of datasets without publications; I've downloaded all of the pollen surface samples from the United States and Canada, and used this loop to try to separate sites that don't have publications from sites that do.