ropensci / openalexR

Getting bibliographic records from OpenAlex
https://docs.ropensci.org/openalexR/

Add new entities: funders, sources, publishers #130

Closed: trangdata closed this 1 year ago

trangdata commented 1 year ago

Closes #124

trangdata commented 1 year ago

@maelle Have you seen this error for other ropensci packages? This is in our check-pkgdown.yml file. https://github.com/ropensci/openalexR/blob/66f07433b5efbdff16c581da9fdb754fb649fb4b/.github/workflows/check-pkgdown.yml#L28

Run devtools::install_github("https://github.com/ropensci-org/rotemplate")

[debug]/usr/local/bin/Rscript /home/runner/work/_temp/ea1f9bee-1054-454b-8bf6-6055f1296b5d

Using github PAT from envvar GITHUB_PAT
Error: Error: Failed to install 'rotemplate' from GitHub:
  SSL peer certificate or SSH remote key was not OK: [api.github.com] SSL: no alternative certificate subject name matches target host name 'api.github.com'
Execution halted
Error: Process completed with exit code 1.

[debug]Finishing: Install dependencies

maelle commented 1 year ago

I haven't seen this error, but I think your workflow can skip installing rotemplate; only pkgdown is needed.

yjunechoe commented 1 year ago

Also, apparently I can't comment on unchanged files in my review, but we should also update id_type(), the internal that oa_fetch() uses to guess the entity type when only an identifier is provided:

https://github.com/ropensci/openalexR/blob/66f07433b5efbdff16c581da9fdb754fb649fb4b/R/utils.R#L59-L68

It may also be worth adding the failing cases below to the tests once that's implemented:

# These work:
i <- oa_fetch(identifier = "I4200000001") # Institution
a <- oa_fetch(identifier = "A1969205032") # Author

# New ones don't yet:
f <- oa_fetch(identifier = "F4320332161") # Funder
#> Error in match.arg(entity, oa_entities()): 'arg' must be NULL or a character vector
p <- oa_fetch(identifier = "P4310311775") # Publisher
#> Error in match.arg(entity, oa_entities()): 'arg' must be NULL or a character vector
s <- oa_fetch(identifier = "S1983995261") # Source
#> Error in match.arg(entity, oa_entities()): 'arg' must be NULL or a character vector
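
For illustration, here is a minimal sketch of prefix-based entity guessing that covers the three new prefixes. This is not the actual id_type() code in openalexR; the function name `guess_entity` and the lookup table are hypothetical. It assumes only that an OpenAlex ID starts with a single letter identifying its entity type.

```r
# Hypothetical helper, not the openalexR internal: map the first letter
# of an OpenAlex ID to the plural entity name used in API endpoints.
guess_entity <- function(identifier) {
  prefix_map <- c(
    W = "works", A = "authors", I = "institutions", C = "concepts",
    F = "funders", P = "publishers", S = "sources"
  )
  first <- toupper(substr(identifier, 1, 1))
  entity <- unname(prefix_map[first])  # unknown prefixes yield NA
  if (anyNA(entity)) {
    stop(
      "Cannot guess entity type for: ",
      paste(identifier[is.na(entity)], collapse = ", "),
      call. = FALSE
    )
  }
  entity
}

guess_entity(c("F4320332161", "P4310311775", "S1983995261"))
#> [1] "funders"    "publishers" "sources"
```

Failing loudly on an unknown prefix gives a clearer message than letting a NULL entity reach match.arg(), which is what produces the errors shown above.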

trangdata commented 1 year ago

Ready for next round of review @yjunechoe 🙏🏽

yjunechoe commented 1 year ago

All looks good to me! You can go ahead and merge.

One thing to watch out for, which I think is probably on the API end: fetching multiple entities by identifier works for Works and Sources but not for Funders or Publishers. Maybe they haven't implemented the OR filter by ID for those yet:

## These work
w2 <- oa_fetch(identifier = c("W2100837269", "W1775749144"))
s2 <- oa_fetch(identifier = c("S2764455111", "S4306400806"))

## These don't work
f2 <- oa_fetch(identifier = c("F4320332161", "F4320306076"))
#> Error: OpenAlex API request failed [403]
#> Invalid query parameters error.
#> <openalex_id is not a valid field. Valid fields are underscore or hyphenated versions of: cited_by_count, continent, country_code, default.search, description.search, display_name, display_name.search, from_created_date, grants_count, ids.crossref, ids.doi, ids.openalex, ids.ror, ids.wikidata, is_global_south, openalex, roles.id, ror, summary_stats.2yr_mean_citedness, summary_stats.h_index, summary_stats.i10_index, wikidata, works_count>
p2 <- oa_fetch(identifier = c("P4310311775", "P4310320990"))
#> Error: OpenAlex API request failed [403]
#> Invalid query parameters error.
#> <openalex_id is not a valid field. Valid fields are underscore or hyphenated versions of: cited_by_count, continent, country_codes, default.search, display_name, display_name.search, from_created_date, hierarchy_level, ids.openalex, ids.ror, ids.wikidata, lineage, openalex, parent_publisher, roles.id, ror, summary_stats.2yr_mean_citedness, summary_stats.h_index, summary_stats.i10_index, wikidata, works_count>

For example, the code for f2 above generates this endpoint which returns an error:

But the code for s2 also uses the same syntax for Source entities and this works fine:

Maybe OpenAlex will catch up on these soon :)

trangdata commented 1 year ago

Thank you for pointing this out, @yjunechoe! I notified the OpenAlex team. Hopefully, they'll fix this soon. 🤞🏽

h1-the-swan commented 1 year ago

OpenAlex team member here. I have added the "openalex_id" filter to funders and publishers, which should fix this issue. Just FYI, though, I believe that filter has been deprecated in favor of "openalex" (e.g., https://api.openalex.org/funders?filter=openalex:https://openalex.org/F4320306076), which is probably why it wasn't included in those newer entities.
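
To compare the two filter names discussed in this thread, here is a rough sketch of how such a request URL could be assembled. The helper `build_filter_url` is hypothetical and not part of openalexR. Joining several IDs with `|` is the OR syntax the OpenAlex filter API uses, but whether a given entity accepts a particular field name or ID form is something to check against the live API.

```r
# Hypothetical helper: build an OpenAlex filter URL for one or more IDs.
# `field` is the filter name; multiple IDs are OR-ed together with "|".
build_filter_url <- function(entity, ids, field = "openalex") {
  stopifnot(
    is.character(entity), length(entity) == 1,
    is.character(ids), length(ids) > 0
  )
  paste0(
    "https://api.openalex.org/", entity,
    "?filter=", field, ":", paste(ids, collapse = "|")
  )
}

# The deprecated filter name, with two funder IDs
build_filter_url("funders", c("F4320332161", "F4320306076"), field = "openalex_id")
#> [1] "https://api.openalex.org/funders?filter=openalex_id:F4320332161|F4320306076"

# The preferred filter name, matching the example URL above
build_filter_url("funders", "https://openalex.org/F4320306076")
#> [1] "https://api.openalex.org/funders?filter=openalex:https://openalex.org/F4320306076"
```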