ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

continue ncbi_downstream() for loop even when one taxon generates error? #923

Open nvpatin opened 6 months ago

nvpatin commented 6 months ago

I am trying to retrieve NCBI taxids for all genomes in a particular phylum. It seems that downstream() only works one level below the given taxid, so I wrote a series of for loops to collect the IDs at each successive taxonomic level. When I get to the species level, I get "Error: Bad Request (HTTP 400)", which I can trace to a small number of specific species.

Is there a way to ensure the taxids still get downloaded for the downstream() taxa of interest, even if some of them generate the error? My code is below along with the final error and session info.

library(taxize)

phylum <- "Mollusca"

# Look up the NCBI taxid for the phylum
phy_ids <- as.data.frame(get_ids(phylum, db='ncbi')$ncbi)[1]

# Classes within the phylum
class_ids <- ncbi_downstream(id=phy_ids[1], downto="class")$childtaxa_id

# Orders within each class
order_ids <- list()

for (x in class_ids) {
  orders_tmp <- ncbi_downstream(id=x, downto="order")
  order_ids <- append(order_ids, list(orders_tmp))
}

# Families within each set of orders
family_ids <- list()

for (x in order_ids) {
  fam_tmp <- ncbi_downstream(id=x$childtaxa_id, downto="family")
  family_ids <- append(family_ids, list(fam_tmp))
}

# Genera within each set of families
genus_ids <- list()

for (x in family_ids) {
  gen_tmp <- ncbi_downstream(id=x$childtaxa_id, downto="genus")
  genus_ids <- append(genus_ids, list(gen_tmp))
}

# Species within each set of genera (this is the loop that fails)
species_ids <- list()

for (x in genus_ids) {
  spp_tmp <- ncbi_downstream(id=x$childtaxa_id, downto="species")
  species_ids <- append(species_ids, list(spp_tmp))
}

Error: Bad Request (HTTP 400)
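
One possible way to keep a loop like the ones above running past a failing taxon is to wrap each request in base R's `tryCatch()` and record which IDs failed. The sketch below is only an illustration: `collect_downstream()` and `fake_fetch()` are hypothetical names, not part of taxize, and `fake_fetch()` stands in for the real request so that the example runs offline.

```r
# Hypothetical helper (not part of taxize): call `fetch` on each ID separately
# so that one failing ID cannot abort the rest of the loop.
collect_downstream <- function(ids, fetch) {
  results <- list()
  failed <- character()
  for (id in ids) {
    res <- tryCatch(
      fetch(id),
      error = function(e) {
        # Report the problem and return NULL instead of stopping
        message("Skipping ", id, ": ", conditionMessage(e))
        NULL
      }
    )
    if (is.null(res)) {
      failed <- c(failed, as.character(id))
    } else {
      results[[as.character(id)]] <- res
    }
  }
  list(results = results, failed = failed)
}

# Stand-in for the real request (assumption): fails for one made-up ID,
# mimicking the HTTP 400 error, and returns a small data frame otherwise.
fake_fetch <- function(id) {
  if (id == "222") stop("Bad Request (HTTP 400)")
  data.frame(childtaxa_id = paste0(id, c("1", "2")), stringsAsFactors = FALSE)
}

out <- collect_downstream(c("111", "222", "333"), fake_fetch)
names(out$results)  # IDs that succeeded
out$failed          # IDs that raised an error
```

With taxize, `fetch` could be something like `function(id) ncbi_downstream(id = id, downto = "species")`, assuming each call is given a single taxid as in the loops above. Calling it one ID at a time also makes it easier to see exactly which taxa trigger the error, since they end up in `failed`.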

Session Info

```
> devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Ventura 13.4.1
 system   x86_64, darwin22.6.0
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2023-12-18
 rstudio  2023.06.1+524 Mountain Hydrangea (desktop)
 pandoc   3.1.8 @ /usr/local/bin/pandoc

─ Packages ──────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 ape           5.7-1   2023-03-13 [1] CRAN (R 4.3.1)
 bold          1.3.0   2023-05-02 [1] CRAN (R 4.3.2)
 cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
 callr         3.7.3   2022-11-02 [1] CRAN (R 4.3.1)
 cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.1)
 codetools     0.2-19  2023-02-01 [2] CRAN (R 4.3.2)
 colorspace    2.1-0   2023-01-23 [1] CRAN (R 4.3.1)
 conditionz    0.1.0   2019-04-24 [1] CRAN (R 4.3.2)
 crayon        1.5.2   2022-09-29 [1] CRAN (R 4.3.1)
 crul          1.4.0   2023-05-17 [1] CRAN (R 4.3.2)
 curl          5.1.0   2023-10-02 [1] CRAN (R 4.3.1)
 data.table    1.14.8  2023-02-17 [1] CRAN (R 4.3.1)
 devtools    * 2.4.5   2022-10-11 [1] CRAN (R 4.3.1)
 digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
 dplyr         1.1.3   2023-09-03 [1] CRAN (R 4.3.1)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.1)
 fansi         1.0.5   2023-10-08 [1] CRAN (R 4.3.1)
 fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
 foreach       1.5.2   2022-02-02 [1] CRAN (R 4.3.1)
 fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
 generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.1)
 ggplot2       3.4.4   2023-10-12 [1] CRAN (R 4.3.1)
 glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
 gtable        0.3.4   2023-08-21 [1] CRAN (R 4.3.1)
 htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
 htmlwidgets   1.6.2   2023-03-17 [1] CRAN (R 4.3.1)
 httpcode      0.3.0   2020-04-10 [1] CRAN (R 4.3.2)
 httpuv        1.6.12  2023-10-23 [1] CRAN (R 4.3.1)
 iterators     1.0.14  2022-02-05 [1] CRAN (R 4.3.1)
 jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.3.1)
 later         1.3.1   2023-05-02 [1] CRAN (R 4.3.1)
 lattice       0.22-5  2023-10-24 [2] CRAN (R 4.3.2)
 lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.1)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
 memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.1)
 mime          0.12    2021-09-28 [1] CRAN (R 4.3.1)
 miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.1)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 4.3.1)
 nlme          3.1-163 2023-08-09 [2] CRAN (R 4.3.2)
 pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.1)
 pkgbuild      1.4.2   2023-06-26 [1] CRAN (R 4.3.1)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.1)
 pkgload       1.3.3   2023-09-22 [1] CRAN (R 4.3.1)
 prettyunits   1.2.0   2023-09-24 [1] CRAN (R 4.3.1)
 processx      3.8.2   2023-06-30 [1] CRAN (R 4.3.1)
 profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.1)
 promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.1)
 ps            1.7.5   2023-04-18 [1] CRAN (R 4.3.1)
 purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.1)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
 Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.3.1)
 remotes       2.4.2.1 2023-07-18 [1] CRAN (R 4.3.1)
 rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.1)
 rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
 scales        1.2.1   2022-08-20 [1] CRAN (R 4.3.1)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
 shiny         1.7.5.1 2023-10-14 [1] CRAN (R 4.3.1)
 stringi       1.8.1   2023-11-13 [1] CRAN (R 4.3.1)
 stringr       1.5.0   2022-12-02 [1] CRAN (R 4.3.1)
 taxize      * 0.9.100 2022-04-22 [1] CRAN (R 4.3.2)
 tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.1)
 tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.1)
 triebeard     0.4.1   2023-03-04 [1] CRAN (R 4.3.2)
 urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.1)
 urltools      1.7.3   2019-04-14 [1] CRAN (R 4.3.2)
 usethis     * 2.2.2   2023-07-06 [1] CRAN (R 4.3.1)
 utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.1)
 uuid          1.1-1   2023-08-17 [1] CRAN (R 4.3.2)
 vctrs         0.6.4   2023-10-12 [1] CRAN (R 4.3.1)
 xml2          1.3.5   2023-07-06 [1] CRAN (R 4.3.1)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.1)
 zoo           1.8-12  2023-04-13 [1] CRAN (R 4.3.1)

 [1] /usr/local/lib/R/4.3/site-library
 [2] /usr/local/Cellar/r/4.3.2/lib/R/library
```