ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

oa_fetch() no longer returns data #164

Closed Ifeanyi55 closed 10 months ago

Ifeanyi55 commented 10 months ago

Since the update to the openalexR library, I have not been able to fetch data via the API. This is the code I previously used to get data:

works_search <- oa_fetch(
  entity = "works",
  title.search = c("simulation", "science mapping"),
  cited_by_count = ">50",
  from_publication_date = "2020-01-01",
  to_publication_date = "2022-12-31",
  options = list(sort = "cited_by_count:desc"),
  verbose = TRUE
)

Since the August 8, 2023 software update, this code breaks, and the documentation has not been updated to describe a new way to fetch data from the API.

I need to figure this out quickly, as I was working on a project based on OpenAlex before the changes were made to the library.

yjunechoe commented 10 months ago

Could you give us a little more info? Are there any errors/warnings/messages associated with the breakage?

FWIW the code works fine for me on the dev version:

library(openalexR)
#> Thank you for using openalexR!
#> To acknowledge our work, please cite the package by calling
#> `citation("openalexR")`.
works_search <- oa_fetch(
  entity = "works",
  title.search = c("simulation", "science mapping"),
  cited_by_count = ">50",
  from_publication_date = "2020-01-01",
  to_publication_date = "2022-12-31",
  options = list(sort = "cited_by_count:desc"),
  verbose = TRUE
)
#> Requesting url: https://api.openalex.org/works?filter=title.search%3Asimulation%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2020-01-01%2Cto_publication_date%3A2022-12-31&sort=cited_by_count%3Adesc
#> Getting 4 pages of results with a total of 714 records...
works_search
#> # A tibble: 714 × 36
#>    id                     display_name author ab    publication_date so    so_id
#>    <chr>                  <chr>        <list> <chr> <chr>            <chr> <chr>
#>  1 https://openalex.org/… LAMMPS - a … <df>   "Sin… 2022-02-01       Comp… http…
#>  2 https://openalex.org/… Predicting … <df>   "Epi… 2020-04-01       Tran… http…
#>  3 https://openalex.org/… Petroleum R… <df>   ""    2020-01-01       Else… http…
#>  4 https://openalex.org/… Simulation … <df>   "Abs… 2020-03-26       Clin… http…
#>  5 https://openalex.org/… TURBOMOLE: … <df>   "Abs… 2020-05-14       Jour… http…
#>  6 https://openalex.org/… DFTB+, a so… <df>   "DFT… 2020-03-23       Jour… http…
#>  7 https://openalex.org/… Simulation … <df>   ""    2020-03-18       Natu… http…
#>  8 https://openalex.org/… Machine Lea… <df>   "Mac… 2020-04-20       Annu… http…
#>  9 https://openalex.org/… PENELOPE: A… <df>   "The… 2020-03-25       OECD… http…
#> 10 https://openalex.org/… Special rep… <df>   "How… 2020-04-02       Natu… http…
#> # ℹ 704 more rows
#> # ℹ 29 more variables: host_organization <chr>, issn_l <chr>, url <chr>,
#> #   pdf_url <chr>, license <chr>, version <chr>, first_page <chr>,
#> #   last_page <chr>, volume <chr>, issue <chr>, is_oa <lgl>,
#> #   is_oa_anywhere <lgl>, oa_status <chr>, oa_url <chr>,
#> #   any_repository_has_fulltext <lgl>, language <chr>, grants <list>,
#> #   cited_by_count <int>, counts_by_year <list>, publication_year <int>, …

Ifeanyi55 commented 10 months ago

Thanks June. Here is the error message I have been getting whenever I run the code:

Requesting url: https://api.openalex.org/works?filter=title.search%3Asimulation%7Cscience%20mapping%2Ccited_by_count%3A%3E50%2Cfrom_publication_date%3A2020-01-01%2Cto_publication_date%3A2022-12-31&sort=cited_by_count%3Adesc
Getting 4 pages of results with a total of 714 records...
  OpenAlex downloading [=====================] 100% eta:  0s
Error in (function (..., deparse.level = 1, make.row.names = TRUE, stringsAsFactors = FALSE,  : 
  invalid list argument: all variables should have the same length

Even after I updated the software, the problem persists.

yjunechoe commented 10 months ago

Hmm. That looks like the same error that some other folks reported recently, which has since been fixed.

> Even after I updated the software, the problem persists.

What version of the package are you using?

packageVersion('openalexR')
#> [1] '1.2.1'

Can you also try installing the dev version from github?

# install.packages("remotes")  # run once if {remotes} is not installed yet
remotes::install_github("ropensci/openalexR")
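
Before reinstalling, it can help to confirm which version R actually sees. The following is only a sketch: `is_older_than()` is a hypothetical helper (not part of openalexR), and the threshold "1.2.1" is taken from the output above purely for illustration, since this thread does not say which release contains the fix.

```r
# Hypothetical helper (not part of openalexR): TRUE if `installed` is
# strictly older than `minimum`. Both are version strings such as "1.2.1".
is_older_than <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

# Illustrative threshold only; the thread does not state which release has the fix.
minimum <- "1.2.1"

if (requireNamespace("openalexR", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("openalexR"))
  if (is_older_than(installed, minimum)) {
    message("openalexR ", installed, " is older than ", minimum, "; consider updating.")
  } else {
    message("openalexR ", installed, " is installed.")
  }
} else {
  message("openalexR is not installed.")
}
```

Comparing with `package_version()` rather than as plain strings means that, for example, "1.10.0" is correctly treated as newer than "1.9.0".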

Ifeanyi55 commented 10 months ago

Thanks! Installing the dev version works. When it stopped working, I updated the software from CRAN, but that didn't solve the issue. Furthermore, I noticed that the parameters of the updated CRAN version had changed as well. I hope this dev version will remain stable for production use.

yjunechoe commented 10 months ago

Good!

Just FYI, we didn't introduce any breaking changes. The OpenAlex API releases new features from time to time, and sometimes they don't come in the format we expected (which is reasonable, but {openalexR} is a third-party wrapper, so we don't really get a heads-up). The package is stable and well maintained; just make sure to keep the package version up to date!
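
For context, the error reported earlier in this thread is the one base R's `rbind.data.frame()` raises when it is given a list whose elements have different lengths. That is the sort of thing that can happen when a response field arrives in an unexpected shape. The following is a toy reproduction only, not openalexR's actual internals; the records and the `row_or_error()` helper are made up for illustration.

```r
# Toy records standing in for parsed API results (made-up data).
ok_record  <- list(id = "W1", cited_by_count = 10L)
bad_record <- list(id = "W2", cited_by_count = c(10L, 20L))  # unexpected length-2 field

# Hypothetical helper: try to bind one record as a data frame row; return
# the data frame, or the error message as a string if binding fails.
row_or_error <- function(record) {
  tryCatch(
    do.call(rbind.data.frame, list(record, stringsAsFactors = FALSE)),
    error = function(e) conditionMessage(e)
  )
}

row_or_error(ok_record)   # binds cleanly into a data frame
row_or_error(bad_record)  # returns the "same length" error message instead
```

A wrapper that assumes every field is length one will fail this way until it is updated to handle the new shape, which is why keeping the package current matters.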

Ifeanyi55 commented 10 months ago

Yes, I plan to keep updating it with the dev version. Thanks again!