Closed: rkrug closed this issue 1 year ago
Thanks - I can replicate this error. It happens because one of the edges connects to a paper that is not present in nodes:
library(openalexR)
library(tidygraph)
ids <- c("W1896013598", "W312683970", "W2084630927")
ilk_snowball <- oa_snowball(
identifier = ids,
verbose = FALSE
)
edge_to_matches <- match(ilk_snowball$edges$to, ilk_snowball$nodes$id)
unmatched <- ilk_snowball$edges[which(is.na(edge_to_matches)), ]$to
unmatched
#> [1] "W1488199547"
For now, the tbl_graph conversion works if you remove the edge that points to a missing node:
# %in% also handles the case of more than one unmatched id
ilk_snowball$edges <- ilk_snowball$edges[!ilk_snowball$edges$to %in% unmatched, ]
as_tbl_graph(ilk_snowball)
#> # A tbl_graph: 301 nodes and 298 edges
#> #
#> # A rooted forest with 3 trees
#> #
#> # A tibble: 301 × 31
#> id display_name author ab publication_date so so_id host_organization
#> <chr> <chr> <list> <chr> <chr> <chr> <chr> <chr>
#> 1 W189… Reforesting… <df> "" 2009-09-01 AMBI… http… Royal Swedish Ac…
#> 2 W312… Does outmig… <df> "In … 2015-08-01 Appl… http… Elsevier BV
#> 3 W208… Lake victor… <df> "The… 2004-05-01 Limn… http… Elsevier BV
#> 4 W210… Classical b… <df> "Of … 2010-08-01 Biol… http… Elsevier BV
#> 5 W198… Payments fo… <df> "Rec… 2012-05-01 Geof… http… Elsevier BV
#> 6 W288… The role of… <df> "Inv… 2019-01-01 Jour… http… Elsevier BV
#> # ℹ 295 more rows
#> # ℹ 23 more variables: issn_l <chr>, url <chr>, pdf_url <chr>, license <chr>,
#> # version <chr>, first_page <chr>, last_page <chr>, volume <chr>,
#> # issue <chr>, is_oa <lgl>, cited_by_count <int>, counts_by_year <list>,
#> # publication_year <int>, cited_by_api_url <chr>, ids <list>, doi <chr>,
#> # type <chr>, referenced_works <list>, related_works <list>,
#> # is_paratext <lgl>, is_retracted <lgl>, concepts <list>, oa_input <lgl>
#> #
#> # A tibble: 298 × 2
#> from to
#> <int> <int>
#> 1 4 3
#> 2 5 1
#> 3 6 3
#> # ℹ 295 more rows
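The removal above targets the `to` column only. A slightly more general sketch, using a toy list that mimics the `nodes$id`, `edges$from` and `edges$to` layout seen in the snowball object, could filter on both endpoints. The helper name `drop_dangling_edges` and the toy ids are hypothetical and not part of openalexR:

```r
# Toy stand-in for an oa_snowball() result: a list with `nodes` and `edges`.
# The ids here are made up for illustration.
snowball <- list(
  nodes = data.frame(id = c("W1", "W2", "W3"), stringsAsFactors = FALSE),
  edges = data.frame(
    from = c("W2", "W3", "W3"),
    to   = c("W1", "W1", "W9"),  # "W9" has no matching node
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper (not an openalexR function): keep only edges whose
# endpoints both appear in nodes$id.
drop_dangling_edges <- function(x) {
  keep <- x$edges$from %in% x$nodes$id & x$edges$to %in% x$nodes$id
  x$edges <- x$edges[keep, , drop = FALSE]
  x
}

cleaned <- drop_dangling_edges(snowball)
cleaned$edges
```

Applied to a real result, the cleaned list could then be passed to as_tbl_graph() as in the example above, assuming the object keeps the same two-element structure.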
Turns out that the missing node W1488199547 is a Deleted Work, so it may have gotten filtered out early:
openalexR::oa_fetch(identifier = "W1488199547")[,1:2]
#> # A tibble: 1 × 2
#> id display_name
#> <chr> <chr>
#> 1 https://openalex.org/W4285719527 Deleted Work
This is confusing all in all, so maybe oa_snowball should at least check for validity between nodes and edges. Anyway, thanks for the report!
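As a rough illustration of what such a check might look like (this is only a sketch, not how oa_snowball is implemented; `find_missing_nodes` is a hypothetical name and the data are toy values), one could collect every edge endpoint and report those without a row in nodes:

```r
# Hypothetical check (not an openalexR function): return edge endpoints
# that have no matching row in `nodes`.
find_missing_nodes <- function(nodes, edges) {
  endpoints <- unique(c(edges$from, edges$to))
  setdiff(endpoints, nodes$id)
}

# Toy data with made-up ids; "W7" is referenced by an edge but is not a node.
nodes <- data.frame(id = c("W1", "W2"), stringsAsFactors = FALSE)
edges <- data.frame(
  from = c("W2", "W2"),
  to   = c("W1", "W7"),
  stringsAsFactors = FALSE
)

missing_ids <- find_missing_nodes(nodes, edges)
if (length(missing_ids) > 0) {
  message("Edges reference ids not in nodes: ",
          paste(missing_ids, collapse = ", "))
}
```

Reporting the ids rather than silently dropping them would leave the decision (remove the edge, or fetch the missing work) to the user.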
I agree. Thanks for looking into this.
Hi, I am trying to use snowball and graph the result using tidygraph and ggraph on a set of (at the moment) 3 works (the ids in the example), but I get the error below. If I exclude the first work, it works. So I assume that some information is missing from the results returned by OpenAlex? Any ideas? Thanks, Rainer
I am unsure, if it is