ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

Fix oa_generate #224

Closed trangdata closed 3 months ago

trangdata commented 3 months ago
library(openalexR)
# Example 1: basic usage getting one record at a time
query_url <- "https://api.openalex.org/works?filter=cites%3AW1160808132"
oar <- oa_generate(query_url, verbose = TRUE)
p1 <- oar() # record 1
#> Getting record 1 of 398 records...
p2 <- oar() # record 2
#> Getting record 2 of 398 records...
p3 <- oar() # record 3
#> Getting record 3 of 398 records...
head(p1)
#> $id
#> [1] "https://openalex.org/W2766937672"
#> 
#> $doi
#> [1] "https://doi.org/10.1016/j.enpol.2017.10.050"
#> 
#> $title
#> [1] "How economic growth, renewable electricity and natural resources contribute to CO2 emissions?"
#> 
#> $display_name
#> [1] "How economic growth, renewable electricity and natural resources contribute to CO2 emissions?"
#> 
#> $publication_year
#> [1] 2018
#> 
#> $publication_date
#> [1] "2018-02-01"
head(p3)
#> $id
#> [1] "https://openalex.org/W2317269391"
#> 
#> $doi
#> [1] "https://doi.org/10.1016/j.renene.2016.03.078"
#> 
#> $title
#> [1] "Determinants of CO2 emissions in the European Union: The role of renewable and non-renewable energy"
#> 
#> $display_name
#> [1] "Determinants of CO2 emissions in the European Union: The role of renewable and non-renewable energy"
#> 
#> $publication_year
#> [1] 2016
#> 
#> $publication_date
#> [1] "2016-08-01"

# Example 2: using `coro::loop()` to iterate through the generator
query_url <- "https://api.openalex.org/works?filter=cited_by%3AW1847168837"
oar <- oa_generate(query_url)
coro::loop(for (x in oar) {
  print(x$id)
})
#> [1] "https://openalex.org/W4248184440"
#> [1] "https://openalex.org/W2063977404"
#> [1] "https://openalex.org/W1987736762"
#> [1] "https://openalex.org/W2127626512"
#> [1] "https://openalex.org/W2135732400"

# Example 3: save records in blocks of 100
query_url <- "https://api.openalex.org/works?filter=cites%3AW1160808132"
oar <- oa_generate(query_url)
n <- 100
recs <- vector("list", n)
i <- 0

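coro::loop(for (x in oar) {
  j <- i %% n + 1 # position of this record within the current block (1..n)
  recs[[j]] <- x
  if (j == n) {
    # block is full: write it out, then start a fresh one
    # saveRDS(recs, sprintf("rec-%s.rds", i %/% n))
    recs <- vector("list", n) # reset recs
  }
  i <- i + 1
})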
head(x)
#> $id
#> [1] "https://openalex.org/W4200579751"
#> 
#> $doi
#> [1] "https://doi.org/10.31679/adamakademi.981595"
#> 
#> $title
#> [1] "Çevresel Kuznets Eğrisi Hipotezinin İslami ve İslami Olmayan Gelişmekte Olan Ekonomiler İçin Karşılaştırmalı Olarak Analizi"
#> 
#> $display_name
#> [1] "Çevresel Kuznets Eğrisi Hipotezinin İslami ve İslami Olmayan Gelişmekte Olan Ekonomiler İçin Karşılaştırmalı Olarak Analizi"
#> 
#> $publication_year
#> [1] 2021
#> 
#> $publication_date
#> [1] "2021-12-30"
j
#> [1] 98
# 398 works total, so j = 98 makes sense.

Created on 2024-03-23 with reprex v2.0.2
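
Note that the loop in Example 3 only writes out full blocks, so the trailing 98 records stay in `recs` when the generator is exhausted. Below is a minimal sketch of one way to flush that last partial block. It is an illustration, not part of the PR: the `records` list is a stand-in for the generator output (398 dummy ids, no API call), and `blocks` collects the chunks in memory instead of calling `saveRDS()`.

```r
# Stand-in for the records yielded by the generator (assumption: 398 dummy ids)
records <- as.list(sprintf("W%03d", seq_len(398)))

n <- 100
recs <- vector("list", n)
blocks <- list() # hypothetical in-memory sink, used here instead of saveRDS()
i <- 0
j <- n # so that an empty input triggers no flush below

for (x in records) {
  j <- i %% n + 1 # position within the current block (1..n)
  recs[[j]] <- x
  if (j == n) {
    blocks[[length(blocks) + 1]] <- recs # full block
    recs <- vector("list", n)            # reset recs
  }
  i <- i + 1
}

# Flush the final, partially filled block, keeping only the filled slots
if (j < n) {
  blocks[[length(blocks) + 1]] <- recs[seq_len(j)]
}

lengths(blocks)
#> [1] 100 100 100  98
```

With the real generator, the same `if (j < n)` check would go right after the `coro::loop()` call.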

yjunechoe commented 3 months ago

@trangdata Maybe this is a locale/encoding thing, but in this test (and only this test) I get some weird Unicode characters in the display name that make test() fail:

https://github.com/ropensci/openalexR/blob/71adc6dff8b793a5c71939600e6d5da6faad163c/tests/testthat/test-oa2df.R#L10-L11

openalexR::oa_fetch(identifier = "S62468778")$display_name
#> [1] "\u0098The \u009cNew England journal of medicine (Print)"

Could you see if you can reproduce this? Otherwise LGTM, so if it's just me I'll merge for now!

rkrug commented 3 months ago

Same for me. The same characters also appear on OpenAlex itself.

trangdata commented 3 months ago

Very strange! Thank you @yjunechoe for pointing this out. There must have been a change on OpenAlex's end: https://api.openalex.org/sources/S62468778

I changed the test and made an edit in sources2df.
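
The actual edit to sources2df is not shown in this thread. For readers hitting the same issue, here is a rough sketch of one possible way to deal with such characters, assuming the goal is simply to drop them: `\u0098` and `\u009c` both fall in the C1 control range (U+0080 to U+009F), which can be stripped with a regular expression. The helper name `strip_c1_controls` is hypothetical and not part of openalexR.

```r
# Hypothetical helper (not an openalexR function): remove C1 control
# characters, i.e. code points U+0080 through U+009F, from a character vector.
strip_c1_controls <- function(x) {
  gsub("[\\x{80}-\\x{9F}]", "", x, perl = TRUE)
}

# The display name reported above, hard-coded so no API call is needed
raw_name <- "\u0098The \u009cNew England journal of medicine (Print)"

strip_c1_controls(raw_name)
#> [1] "The New England journal of medicine (Print)"
```

Whether stripping is the right fix depends on what the test is meant to check. Comparing against the raw value returned by the API would be another option.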