Closed oharac closed 4 years ago
Thanks - taxonomy is a deep dark hole from which many weird taxonomic ranks emerge from time to time. one of the taxa had the rank "epifamily" https://www.marinespecies.org/aphia.php?p=taxdetails&id=1459303
fixed, if you reinstall it should work
awesome - thanks so much! If I run into another odd rank in the next round of searching I will post it here.
thanks
I encountered this same error with the WoRMS database again, but it seems to be for a different reason. `taxize::downstream` for family Polynoidae (id 939) returns:
downstream(939, db = 'worms', downto = 'species')[[1]]
# Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1) :
# values must be length 1,
# but FUN(X[[376]]) result is length 0
Knowing that prior issues were due to oddball ranks, I checked the downstream ranks. Here the problem is an `NA` rank, caused by a `null` rank listed in the output from the `AphiaChildrenByAphiaID` API endpoint. The ones I've found so far are children of ID 129496, though I have not done an exhaustive search, so there may be others as well. Here is part of the record for one example, ID 333822, as retrieved from https://www.marinespecies.org/rest/AphiaChildrenByAphiaID/129496?marine_only=true&offset=95:
"AphiaID": 333822,
"url": "https://www.marinespecies.org/aphia.php?p=taxdetails&id=333822",
"scientificname": "Lepidonotus pellucidus",
"authority": "Dyster in Johnston, 1865",
"status": "accepted",
"unacceptreason": null,
"taxonRankID": 220,
"rank": null,
"valid_AphiaID": 333822,
"valid_name": "Lepidonotus pellucidus",
However, when accessing this species in the other direction, using the `AphiaClassificationByAphiaID` endpoint (https://www.marinespecies.org/rest/AphiaClassificationByAphiaID/333822), the API seems to return the rank as "Species" as expected. This seems to be an issue on the WoRMS end (and I emailed them to point it out), but in the meantime perhaps there's a graceful way to handle the `NA` rank value in `taxize::downstream()` without throwing an error. Thanks!
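For anyone trying to understand the error message itself, here is a minimal base-R sketch of how a missing rank can produce a zero-length result inside `vapply`. It does not use taxize's internals: `known_ranks` and `lookup_rank` are hypothetical stand-ins made up for illustration, and the rank table is abbreviated.

```r
# Hypothetical, abbreviated rank table (NOT taxize's actual reference table).
known_ranks <- c("phylum", "class", "order", "family", "genus", "species")

# Hypothetical lookup: position of a rank in the table.
# which() returns integer(0) when nothing matches, which includes NA input.
lookup_rank <- function(rank) which(known_ranks == rank)

# One rank is missing, mimicking the null rank in the API response.
ranks <- c("family", "genus", NA, "species")

# vapply() insists on exactly one value per element, so the NA entry fails.
result <- tryCatch(
  vapply(ranks, lookup_rank, 1),
  error = function(e) conditionMessage(e)
)
print(result)
```

The same thing would happen for any rank string that is simply absent from the table, which is consistent with the earlier "epifamily" case.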
Thanks for the report.
Unfortunately, there's no way to handle missing ranks really, other than perhaps making additional http requests for every single name that does not have a rank, which seems like a mess and I'd rather avoid doing that.
For now, I'm changing (reinstall to get the change) the code to change missing ranks for WoRMS to "no rank" (which NCBI has a lot of), and then the existing code handles the "no rank" already. "no rank" taxa are dropped in most cases. The errors are coming from the `prune_too_low` function https://github.com/ropensci/taxize/blob/master/R/downstream-utils.R#L9 where we drop any taxa that have ranks lower than the target rank.
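A rough sketch of that approach in plain R, for illustration only. This is not the actual taxize code: the `rank_order` table is abbreviated, and the `kids` data frame (IDs and names included) is made up.

```r
# Hypothetical, abbreviated rank order from highest to lowest (an assumption).
rank_order <- c("phylum", "class", "order", "family", "genus", "species")

# Made-up children table; the NA mimics a null rank from the API.
kids <- data.frame(
  id   = c(1L, 2L, 3L),
  name = c("Taxon a", "Taxon b", "Taxon c"),
  rank = c("genus", NA, "species"),
  stringsAsFactors = FALSE
)

# Step 1: recode missing ranks as "no rank".
kids$rank[is.na(kids$rank)] <- "no rank"

# Step 2: drop "no rank" taxa so every remaining rank has a table position.
ranked <- kids[kids$rank != "no rank", ]

# Step 3: drop taxa whose rank is lower than the target rank.
target <- "genus"
pos <- match(ranked$rank, rank_order)
kept <- ranked[pos <= match(target, rank_order), ]
print(kept)
```

Recoding before pruning means the rank lookup never sees an `NA`, so no extra HTTP requests are needed to recover the missing rank.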
I emailed the WoRMS folks and their response was that they couldn't replicate the `null` rank thing - so checking today, I can't replicate it either - I guess it was an intermittent problem (though I could replicate it on the day I posted the issue).
Thanks for the follow up. Well glad it was an intermittent thing; hopefully it doesn't come back.
A new instance of the zero-length error in WoRMS `downstream`:
downstream(345465, db = 'worms', downto = 'class', marine_only = FALSE)[[1]]
In case this is a similar problem to those noted before, where odd taxonomic ranks would create this error, I checked the children of this sequentially to identify any unusual ranks.
- `children(345465, db = 'worms', marine_only = FALSE)[[1]]` returned a couple of "Subphylum" ranks.
- `children(588641, db = 'worms', marine_only = FALSE)[[1]]` returned a couple of "Infraphylum" ranks.
- `children(369192, db = 'worms', marine_only = FALSE)[[1]]` returned 151 instances where the classification skips from "Subphylum" (369192) all the way down to "Genus" in one step, which seems odd.

thanks! will have a look
@oharac should be fixed now. the missing rank was infraphylum
EDITED...
getting back into this project, ran across this error again
Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1) :
values must be length 1,
but FUN(X[[53]]) result is length 0
Reprex:
library(taxize)
downstream(sci_id = 1821, db = 'worms', downto = 'class')
#> Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1): values must be length 1,
#> but FUN(X[[3]]) result is length 0
Created on 2021-08-20 by the reprex package (v1.0.0)
Sequential calls to `children` showed where the code seemed to be choking. I wonder if these ranks need to be added to `rank_ref_zoo`: parvphylum, megaclass, gigaclass?
More reprex:
library(taxize)
### chokes on 1821:
downstream(1821, db = 'worms', downto = 'class')
#> Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1): values must be length 1,
#> but FUN(X[[3]]) result is length 0
children(sci_id = 1821, db = 'worms')
#> $`1821`
#> # A tibble: 4 x 3
#> childtaxa_id childtaxa_name childtaxa_rank
#> <int> <chr> <chr>
#> 1 1824 Cephalochordata Subphylum
#> 2 146420 Tunicata Subphylum
#> 3 1822 Urochordata Subphylum
#> 4 146419 Vertebrata Subphylum
#>
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "worms"
### chokes on subphylum Vertebrata:
downstream(146419, downto = 'class', db = 'worms')
#> Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1): values must be length 1,
#> but FUN(X[[3]]) result is length 0
children(146419, db = 'worms')
#> $`146419`
#> # A tibble: 2 x 3
#> childtaxa_id childtaxa_name childtaxa_rank
#> <int> <chr> <chr>
#> 1 1829 Agnatha Infraphylum
#> 2 1828 Gnathostomata Infraphylum
#>
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "worms"
### chokes on infraphylum Gnathostomata:
downstream(1828, downto = 'class', db = 'worms')
#> Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1): values must be length 1,
#> but FUN(X[[1]]) result is length 0
children(1828, db = 'worms')
#> $`1828`
#> # A tibble: 4 x 3
#> childtaxa_id childtaxa_name childtaxa_rank
#> <int> <chr> <chr>
#> 1 1517375 Chondrichthyes Parvphylum
#> 2 152352 Osteichthyes Parvphylum
#> 3 11676 Pisces Superclass
#> 4 1831 Tetrapoda Megaclass
#>
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "worms"
### chokes on parvphylum Osteichthyes
downstream(152352, downto = 'class', db = 'worms')
#> Error in vapply(x$rank, function(z) which_rank(z, zoo = zoo), 1): values must be length 1,
#> but FUN(X[[1]]) result is length 0
children(152352, db = 'worms')
#> $`152352`
#> # A tibble: 2 x 3
#> childtaxa_id childtaxa_name childtaxa_rank
#> <int> <chr> <chr>
#> 1 10194 Actinopterygii Gigaclass
#> 2 163509 Sarcopterygii Gigaclass
#>
#> attr(,"class")
#> [1] "children"
#> attr(,"db")
#> [1] "worms"
### finally is OK at this stage
downstream(10194, downto = 'class', db = 'worms')
#> $`10194`
#> id name rank
#> 1 843664 Actinopteri class
#>
#> attr(,"class")
#> [1] "downstream"
#> attr(,"db")
#> [1] "worms"
downstream(163509, downto = 'class', db = 'worms')
#> $`163509`
#> id name rank
#> 1 843665 Coelacanthi class
#>
#> attr(,"class")
#> [1] "downstream"
#> attr(,"db")
#> [1] "worms"
# both OK
Created on 2021-08-21 by the reprex package (v1.0.0)
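As a quicker way to spot candidate ranks than stepping through `downstream` call by call, the child ranks can be compared against a list of known ranks. The sketch below is offline and hedged: `known_ranks` is a hypothetical, abbreviated table, not the real `rank_ref_zoo`, and `child_ranks` is copied by hand from the `children` output above.

```r
# Hypothetical, abbreviated list of recognized ranks (NOT the real rank_ref_zoo).
known_ranks <- c("phylum", "subphylum", "infraphylum", "superclass", "class")

# Child ranks copied from the children() output above.
child_ranks <- c("Subphylum", "Infraphylum", "Parvphylum",
                 "Superclass", "Megaclass", "Gigaclass")

# Ranks that have no entry in the table, compared case-insensitively.
unknown <- setdiff(tolower(child_ranks), known_ranks)
print(unknown)
```

With a real rank table substituted for `known_ranks`, anything left in `unknown` would be a rank worth reporting.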
Thanks for the info! I will look into this and see about adding those ranks.
Sorry for the delay. I have added the ranks and made the error message better.
You can try out the change by installing this version that will be pushed to CRAN soon hopefully, but note this version has many other changes and might break other code.
install.packages("remotes")
remotes::install_github("ropensci/taxize")
Hi,
I'm finding this package to be really useful, but I'm running into a bug. I am using `taxize::downstream` to access the WoRMS database to get all families related to a set of specific orders. For nearly everything, it works fine, but for Decapoda (1130) and Amphipoda (1135) it returns this error:

EDIT: I see that this is similar to #821 and #824 - those were related to a problem with rank name - perhaps something similar is happening here?
Reproducible example:
Session Info
```r
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] taxize_0.9.98.91

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5 ape_5.4-1 lattice_0.20-41 prettyunits_1.1.1 ps_1.3.3
 [6] zoo_1.8-8 assertthat_0.2.1 rprojroot_1.3-2 digest_0.6.25 foreach_1.5.0
[11] R6_2.4.1 plyr_1.8.6 backports_1.1.8 RSQLite_2.2.0 pillar_1.4.6
[16] rlang_0.4.7 curl_4.3 uuid_0.1-4 rstudioapi_0.11 data.table_1.13.0
[21] callr_3.4.3 blob_1.2.1 worrms_0.4.2 desc_1.2.0 urltools_1.7.3
[26] devtools_2.3.0 stringr_1.4.0 bit_4.0.4 triebeard_0.3.0 compiler_3.6.3
[31] xfun_0.14 pkgconfig_2.0.3 pkgbuild_1.0.8 conditionz_0.1.0 tidyselect_1.1.0
[36] tibble_3.0.3 httpcode_0.3.0 codetools_0.2-16 reshape_0.8.8 fansi_0.4.1
[41] crayon_1.3.4 dplyr_1.0.2 hoardr_0.5.2 dbplyr_1.4.4 withr_2.2.0
[46] rappdirs_0.3.1 crul_1.0.0 grid_3.6.3 nlme_3.1-148 jsonlite_1.7.1
[51] lifecycle_0.2.0 DBI_1.1.0 magrittr_1.5 taxizedb_0.2.2.93 cli_2.0.2
[56] stringi_1.5.3 fs_1.4.1 remotes_2.1.1 testthat_2.3.2 xml2_1.3.2
[61] ellipsis_0.3.1 generics_0.0.2 vctrs_0.3.4 iterators_1.0.12 tools_3.6.3
[66] bold_1.1.0 bit64_4.0.5 glue_1.4.2 purrr_0.3.4 processx_3.4.2
[71] pkgload_1.1.0 parallel_3.6.3 sessioninfo_1.1.1 memoise_1.1.0 knitr_1.28
[76] usethis_1.6.1
```