muschellij2 / rscopus

Scopus Database API Interface to R

author_df for a list or vector of authors #7

Closed mcolagrossi closed 6 years ago

mcolagrossi commented 6 years ago

I have a list of more than 9000 authors. For each one, I would like to obtain their publications, the publication dates, and their Author ID so that I can run further queries. The following call:

x = author_df(last_name = "Muschelli", first_name = "John", verbose = FALSE)

returns the information I am looking for. How can I run the same query for a list of authors stored in a data frame, such as:

last <- c("Cho", "Mansury", "Ye", "Florida", "Mellander")
first <- c("Jae Beum", "Yuri S.", "Xinyue", "Richard", "Charlotta")
db <- cbind(last, first)

For example, this call:

x = author_df(last_name = last, first_name = first, verbose = FALSE)

returns:

Error in `$<-.data.frame`(`*tmp*`, "first_name", value = c("Jae Beum",  : 
  replacement has 5 rows, data has 1

I suspect I have to loop through the values of my variables and append the results of each query, but I have not been successful so far.

Thanks,

Marco

muschellij2 commented 6 years ago

You need to loop over the authors and query each one separately.

First, let's make a tibble with the data:

library(dplyr)
library(rscopus)
last <- c("Cho", "Mansury", "Ye", "Florida", "Mellander")
first <- c("Jae Beum", "Yuri S.", "Xinyue", "Richard", "Charlotta")
# tibble() replaces the deprecated data_frame()
db <- tibble(last, first)
db = db %>%
    mutate(full_name = paste(first, last))

Then let's make an empty list and populate it with the results of author_df:

all_df = vector(mode = "list", length = nrow(db))
names(all_df) = db$full_name
for (iauth in seq_len(nrow(db))) {
    x = author_df(last_name = db$last[iauth] , first_name = db$first[iauth], verbose = FALSE)
    all_df[[iauth]] = x
}

Then we can make one big tibble if you want (.id = "full_name" stores each list element's name in a full_name column):

all_data = bind_rows(all_df, .id = "full_name")
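
With around 9000 authors, a single failed lookup (for example, an author that cannot be found) would stop a plain loop partway through. The following is a minimal sketch of one way to guard against that, not part of rscopus itself: fetch_one and fetch_all are hypothetical helper names, and the fetch argument is assumed to be a function with the same arguments as author_df (last_name, first_name, verbose).

```r
library(dplyr)

# Hypothetical helper: query one author, returning NULL instead of
# stopping if the lookup raises an error.
fetch_one <- function(last, first, fetch) {
  tryCatch(
    fetch(last_name = last, first_name = first, verbose = FALSE),
    error = function(e) {
      message("Failed for ", first, " ", last, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Hypothetical helper: query every row of db (columns `last` and `first`)
# and stack the successful results into one data frame.
fetch_all <- function(db, fetch) {
  results <- Map(fetch_one, db$last, db$first, MoreArgs = list(fetch = fetch))
  names(results) <- paste(db$first, db$last)
  # Drop the authors whose lookup failed before combining
  results <- results[!vapply(results, is.null, logical(1))]
  bind_rows(results, .id = "full_name")
}
```

Assuming rscopus is loaded and an API key is configured, this would be called as all_data = fetch_all(db, author_df). Failed authors are reported with message() and simply left out of the result, so they can be retried later.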