muschellij2 / rscopus

Scopus Database API Interface to R

Error occurs when a search returns no entries #24

Closed: hope-data-science closed this issue 4 years ago

hope-data-science commented 5 years ago

I am currently using scopus_search(), which is quite convenient for advanced searches. However, when the number of output entries is 0, an error is thrown, which is inconvenient when the call sits inside a loop. Could you please add some code to handle this case so that it returns a data frame with 0 rows and the same column names? It would be even better if the searches that returned nothing could be recorded. Thanks.

muschellij2 commented 5 years ago

Please provide an MCVE: https://stackoverflow.com/help/mcve.

hope-data-science commented 5 years ago

For example:

library(pacman)
p_load(rscopus,tidyverse)
set_api_key("********************************")
tibble() -> all_df -> all_au -> all_aff

# "1051-0761" could not get results
issn_test = c("0169-5347","1051-0761","2051-1434")

for(i in seq_along(issn_test)){
  res = scopus_search(query = str_c("ISSN(",issn_test[i],") AND  PUBYEAR = 2018"), 
                      max_count = Inf,
                      count = 25,
                      view = "COMPLETE")
  df = gen_entries_to_df(res$entries)

  df$df %>% 
    mutate(issn = issn_test[i]) %>% 
    bind_rows(all_df,.) -> all_df

  df$author %>% 
    mutate(issn = issn_test[i]) %>%  
    bind_rows(all_au,.) -> all_au

  df$affiliation %>% 
    mutate(issn = issn_test[i]) %>% 
    bind_rows(all_aff,.) -> all_aff
}

The code above produces the following output and error:

The query list is: 
list(query = "ISSN(0169-5347) AND  PUBYEAR = 2018", count = 25, 
    start = 0, view = "COMPLETE")
$query
[1] "ISSN(0169-5347) AND  PUBYEAR = 2018"

$count
[1] 25

$start
[1] 0

$view
[1] "COMPLETE"

Response [https://api.elsevier.com/content/search/scopus?query=ISSN%280169-5347%29%20AND%20%20PUBYEAR%20%3D%202018&count=25&start=0&view=COMPLETE]
  Date: 2019-05-03 03:31
  Status: 200
  Content-Type: application/json;charset=UTF-8
  Size: 99.3 kB

Total Entries are 120
5 runs need to be sent with current count
  |=====================================================================================================================| 100%
Number of Output Entries are 120

The query list is: 
list(query = "ISSN(1051-0761) AND  PUBYEAR = 2018", count = 25, 
    start = 0, view = "COMPLETE")
$query
[1] "ISSN(1051-0761) AND  PUBYEAR = 2018"

$count
[1] 25

$start
[1] 0

$view
[1] "COMPLETE"

Response [https://api.elsevier.com/content/search/scopus?query=ISSN%281051-0761%29%20AND%20%20PUBYEAR%20%3D%202018&count=25&start=0&view=COMPLETE]
  Date: 2019-05-03 03:31
  Status: 200
  Content-Type: application/json;charset=UTF-8
  Size: 532 B

Total Entries are 0
Number of Output Entries are 1

Error in UseMethod("mutate_") : 
  no applicable method for 'mutate_' applied to an object of class "NULL"
In addition: Warning message:
In scopus_search(query = str_c("ISSN(", issn_test[i], ") AND  PUBYEAR = 2018"),  :
  May not have received all entries

As a result, all_df never ends up containing all the information, because the error interrupts the loop.

muschellij2 commented 5 years ago

You can wrap the processing in an if statement that checks res$total_results > 0:

library(pacman)
p_load(rscopus,tidyverse)
tibble() -> all_df -> all_au -> all_aff

# "1051-0761" could not get results
issn_test = c("0169-5347","1051-0761","2051-1434")
for(i in seq_along(issn_test)){
  res = scopus_search(query = str_c("ISSN(",issn_test[i],") AND  PUBYEAR = 2018"), 
                      max_count = Inf,
                      count = 25,
                      view = "COMPLETE")
  # Skip ISSNs with no hits; otherwise mutate() fails on a NULL data frame
  if (res$total_results > 0) {
    df = gen_entries_to_df(res$entries)

    df$df %>% 
      mutate(issn = issn_test[i]) %>% 
      bind_rows(all_df,.) -> all_df

    df$author %>% 
      mutate(issn = issn_test[i]) %>%  
      bind_rows(all_au,.) -> all_au

    df$affiliation %>% 
      mutate(issn = issn_test[i]) %>% 
      bind_rows(all_aff,.) -> all_aff
  }
}
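
The original request also asked for the empty searches to be recorded. A minimal sketch of one way to do that, using the same res$total_results check, is shown here. Note the assumptions: has_results() is a hypothetical helper, not part of rscopus, and mock_results is made-up data standing in for what scopus_search() would return, so the snippet runs without an API key. Only the zero count for "1051-0761" comes from the output above; the other counts are placeholders.

```r
# Hypothetical helper (not part of rscopus): TRUE when a search result
# reports at least one entry in its total_results field.
has_results <- function(res) {
  n <- res$total_results
  !is.null(n) && length(n) == 1 && !is.na(n) && n > 0
}

# Mock stand-ins for scopus_search() output, keyed by ISSN.
# The non-zero counts are placeholders, not real Scopus figures.
mock_results <- list(
  "0169-5347" = list(total_results = 120),
  "1051-0761" = list(total_results = 0),
  "2051-1434" = list(total_results = 1)
)

missing_issn <- character(0)  # ISSNs whose search returned nothing

for (issn in names(mock_results)) {
  # In real use: res <- scopus_search(query = ..., view = "COMPLETE")
  res <- mock_results[[issn]]

  if (!has_results(res)) {
    # Record the empty search and move on instead of failing later
    missing_issn <- c(missing_issn, issn)
    next
  }

  # In real use: df <- gen_entries_to_df(res$entries), then bind_rows()
  # the df, author and affiliation tables as in the loop above.
}

print(missing_issn)
```

After the loop, missing_issn holds the ISSNs that returned no entries, so they can be reviewed or retried separately.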