ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

oa2df fails to convert list to df #232

Closed peetmate closed 2 months ago

peetmate commented 2 months ago

Hi,

I think this should be repeatable: https://github.com/CIAT/ERA_dev/blob/main/R/search/livestock_2024/search_terms.R

Running search i=4 (line 247:249)

oa2df fails to convert list created by oa_request to df

This is the error received:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 3, 0

janpht commented 2 months ago

Hi there,

I get the same "differing number of rows" error when running the following with oa_fetch:

library(openalexR)
library(tidyverse)

books <- oa_fetch(
  entity = "works", 
  from_publication_date = "2020-01-01",
  to_publication_date = "2023-12-31",
  type = "book",
  verbose = TRUE
)

mariusbommert commented 2 months ago

Hi,

I got a similar error message and it seems to be a problem with raw_affiliation_string in my case.

query <- oa_query(
  identifier = "https://openalex.org/W3081691125",
  entity = "works",
  endpoint = "https://api.openalex.org"
)

res <- oa_request(
  query_url = query,
  count_only = FALSE,
  verbose = FALSE
)

oa2df(res, entity = "works")

leads to

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 1, 0

I analysed the error and found that the problem seems to occur at line 229 of https://github.com/ropensci/openalexR/blob/main/R/oa2df.R, or in the rbind of that part, when raw_affiliation_string is empty. In my fork of this repo (https://github.com/mariusbommert/openalexR/blob/main/R/oa2df.R, line 244) I used

 au_affiliation_raw <- unlist(replace_w_na(l$raw_affiliation_strings[1]))

instead, which seems to fix the problem. Maybe the problem arose because OpenAlex deprecated raw_affiliation_string in favour of raw_affiliation_strings; see https://docs.openalex.org/api-entities/works/work-object/authorship-object.

In my fork of this repo I added options to get more than one institution and raw_affiliation_string per author. This is why I have a different number of lines of code.
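The failure mode described above can be reproduced without calling the API. The following is a minimal sketch, not the code in oa2df.R: the `authorship` list is a made-up stand-in for one authorship entry, and `first_or_na()` is an illustrative helper, not the package's `replace_w_na()`. It only shows why an empty field breaks `data.frame()` and how padding with NA avoids it.

```r
# Hypothetical authorship entry; the affiliation field is present but empty.
authorship <- list(
  author_id = "A1",
  raw_affiliation_strings = character(0)
)

# Building a one-row data frame fails: one column has length 1, the other 0.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(
  data.frame(
    au_id = authorship[["author_id"]],
    au_affiliation_raw = authorship[["raw_affiliation_strings"]]
  ),
  error = function(e) conditionMessage(e)
)
print(err)

# Illustrative helper (an assumption, not openalexR code):
# NULL or empty input becomes NA, otherwise the first value is kept.
first_or_na <- function(x) {
  x <- unlist(x)
  if (length(x) == 0) NA_character_ else as.character(x[[1]])
}

# With the padding in place every column has length 1 and the call succeeds.
fixed <- data.frame(
  au_id = authorship[["author_id"]],
  au_affiliation_raw = first_or_na(authorship[["raw_affiliation_strings"]])
)
print(fixed)
```

Using `[[` rather than `$` in the sketch is deliberate: `$` on a list does partial name matching, so a lookup for a deprecated field name could silently pick up a similarly named field.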

janpht commented 2 months ago

What a response time! Thank you very much, it works well.

trangdata commented 2 months ago

I analysed the error and found out that the problem seems to occur [...] if the raw_affiliation_string is empty.

Thanks a lot @mariusbommert for investigating the issue. You're right, I needed to consider the NULL case for raw_affiliation_string. Fixed in #233.

rkrug commented 2 months ago

This might be a bigger task, but these conversions to data frames seem to be a moving target and fragile to changes in the OA structures. Could a more robust approach be used for this conversion overall, one that keeps working after such changes and still gives a consistent answer?
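One way to read this suggestion is a conversion driven by a fixed list of expected columns, where any field that is missing, NULL, or empty becomes NA instead of raising an error. The sketch below is only an illustration under that assumption; it is not how openalexR implements oa2df, and `records`, `wanted`, and `scalar_or_na()` are hypothetical names.

```r
# Hypothetical records: the second one lacks `doi` and has a NULL `title`.
records <- list(
  list(id = "W1", title = "First work", doi = "10.1000/x1"),
  list(id = "W2", title = NULL)
)

# Expected output columns; anything absent from a record is filled with NA.
wanted <- c("id", "title", "doi")

# Return a length-one character value for `field`, or NA if missing or empty.
scalar_or_na <- function(record, field) {
  value <- unlist(record[[field]])
  if (length(value) == 0) NA_character_ else as.character(value[[1]])
}

# Build one single-row data frame per record, then stack them.
rows <- lapply(records, function(record) {
  values <- vapply(wanted, function(f) scalar_or_na(record, f), character(1))
  as.data.frame(as.list(values), stringsAsFactors = FALSE)
})
works <- do.call(rbind, rows)
print(works)
```

The trade-off is that a renamed upstream field shows up as a column of NA values rather than as an error, so a real implementation would probably also want to warn when an expected field is absent from every record.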

peetmate commented 2 months ago

many thanks for reviewing this issue

manubrucco commented 6 days ago

Hello there! When I apply the solution proposed by mariusbommert, an error occurs

Error in empty_list(inst_cols) : could not find function "empty_list"

Thank you in advance for any help you can provide!

trangdata commented 6 days ago

Hi @manubrucco could you re-install the package and try again? 🙏🏽

install.packages("openalexR")

manubrucco commented 6 days ago

@trangdata Thank you!! I'd updated the package before and it didn't work, but now that I uninstalled it and installed it again, it worked!!