ropensci / taxize

A taxonomic toolbelt for R
https://docs.ropensci.org/taxize

Error in { : task 1 failed - "Bad Gateway (HTTP 502)" (TROPICOS) #930

Open ggrittz opened 2 months ago

ggrittz commented 2 months ago

Hi guys.

I'm having an issue similar to this, but using taxize::gnr_resolve with the argument data_source_ids = 165 (Tropicos db). My input is a vector of 12,805 names, which I already split into 10 chunks to process in parallel:

#Solving name issues with Tropicos
library(parallel)
library(doParallel)
library(foreach)

#Set up parallel backend
num_cores <- detectCores() - 2
cl <- makeCluster(num_cores)
registerDoParallel(cl)

#Define a function to match names to Tropicos db (ID = 165)
match_names <- function(data_chunk, data_source_ids = 165) {
  #Use the argument instead of a hard-coded ID
  taxize::gnr_resolve(sci = data_chunk,
                      data_source_ids = data_source_ids)
}

#Partition the dataset into smaller chunks
num_chunks <- num_cores  #One chunk for each core
dataset_chunks <- split(to_tps$input_full_name, 1:num_chunks)

#Parallelize the matching process for each chunk
tps_data <- foreach(chunk = dataset_chunks,
                    .combine = rbind,
                    .packages = "taxize",
                    .export = "dataset_chunks") %dopar% {
                      #data_source_ids is not defined on the workers, so rely on the default (165)
                      match_names(chunk)
                    }

#Stop multicore processing
stopCluster(cl)

Not sure if it's relevant, but my laptop has 12 cores and I'm using 10 of them for the parallel run. If it's still an issue with the Tropicos db, I'll try another solution.
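
One thing that could be tried, on the assumption (not confirmed anywhere in this report) that the 502 is transient or related to ten large requests hitting the resolver at once, is to send smaller batches one after another and retry a failed batch after a pause. The sketch below is only an illustration: `make_batches` and `with_retry` are hypothetical helpers, not part of taxize, and the batch size, number of attempts, and wait times are guesses.

```r
# Hypothetical helpers (not part of taxize): batching plus retry with backoff.

# Split a vector into consecutive batches of at most `size` elements.
make_batches <- function(x, size = 500) {
  split(x, ceiling(seq_along(x) / size))
}

# Call f(batch); on error, wait and try again, doubling the wait each time.
# After max_tries failed attempts, stop with the last error message.
with_retry <- function(f, batch, max_tries = 5, wait = 2) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(batch), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (attempt < max_tries) {
      message("Attempt ", attempt, " failed: ", conditionMessage(result))
      Sys.sleep(wait)
      wait <- wait * 2
    }
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Usage sketch (needs taxize and network access; size = 500 is an assumption):
# batches <- make_batches(to_tps$input_full_name, size = 500)
# tps_data <- do.call(rbind, lapply(batches, function(b) {
#   with_retry(function(x) taxize::gnr_resolve(sci = x, data_source_ids = 165), b)
# }))
```

Running the batches sequentially trades speed for a lighter load on the remote service. If that works, the same `with_retry` wrapper could be called inside the `%dopar%` body with fewer workers.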


R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.utf8  LC_CTYPE=Portuguese_Brazil.utf8  LC_MONETARY=Portuguese_Brazil.utf8
[4] LC_NUMERIC=C  LC_TIME=Portuguese_Brazil.utf8

time zone: America/Sao_Paulo
tzcode source: internal

attached base packages:
[1] parallel  stats  graphics  grDevices  utils  datasets  methods  base

other attached packages:
[1] Taxonomy_0.0.0.9000  testthat_3.2.1  dplyr_1.1.4  doParallel_1.0.17  iterators_1.0.14  foreach_1.5.2

loaded via a namespace (and not attached):
  [1] countrycode_1.5.0  splines_4.3.2  later_1.3.2  rnaturalearth_1.0.1  urltools_1.7.3
  [6] fields_15.2  tibble_3.2.1  triebeard_0.4.1  FuzzyNumbers.Ext.2_3.2  XML_3.99-0.16.1
 [11] rpart_4.1.21  lifecycle_1.0.4  sf_1.0-15  bold_1.3.0  rprojroot_2.0.4
 [16] GenSA_1.1.14  NLP_0.2-1  globals_0.16.2  lattice_0.21-9  MASS_7.3-60
 [21] crosstalk_1.2.1  magrittr_2.0.3  remotes_2.4.2.1  httpuv_1.6.14  leafgl_0.1.1
 [26] spam_2.10-0  sp_2.1-3  sessioninfo_1.2.2  pkgbuild_1.4.3  pbapply_1.7-2
 [31] DBI_1.2.2  RColorBrewer_1.1-3  FuzzyNumbers_0.4-7  maps_3.4.2  abind_1.4-5
 [36] pkgload_1.3.4  purrr_1.0.2  nnet_7.3-19  WorldFlora_1.14-1  ipred_0.9-14
 [41] lava_1.7.3  tm_0.7-12  listenv_0.9.1  crul_1.4.0  terra_1.7-71
 [46] units_0.8-5  conditionz_0.1.0  parallelly_1.37.1  codetools_0.2-19  xml2_1.3.6
 [51] tidyselect_1.2.0  raster_3.6-26  httpcode_0.3.0  viridis_0.6.5  base64enc_0.1-3
 [56] jsonlite_1.8.8  e1071_1.7-14  ellipsis_0.3.2  survival_3.5-7  segmented_2.0-3
 [61] tools_4.3.2  snow_0.4-4  stringdist_0.9.12  Rcpp_1.0.12  glue_1.7.0
 [66] prodlim_2023.08.28  gridExtra_2.3  xfun_0.42  here_1.0.1  usethis_2.2.3
 [71] RcppProgress_0.4.2  withr_3.0.0  fastmap_1.1.1  fansi_1.0.6  digest_0.6.34
 [76] ConR_2.1  R6_2.5.1  mime_0.12  colorspace_2.1-0  dichromat_2.0-0.1
 [81] RSQLite_2.3.5  tidyr_1.3.1  utf8_1.2.4  generics_0.1.3  data.table_1.15.2
 [86] robustbase_0.99-2  class_7.3-22  httr_1.4.7  htmlwidgets_1.6.4  tmaptools_3.1-1
 [91] whisker_0.4.1  pkgconfig_2.0.3  gtable_0.3.4  blob_1.2.4  brio_1.1.4
 [96] htmltools_0.5.7  profvis_0.3.8  dotCall64_1.1-1  scales_1.3.0  png_0.1-8
[101] doSNOW_1.0.20  geohashTools_0.3.3  knitr_1.45  RecordLinkage_0.4-12.4  rstudioapi_0.15.0
[106] rgbif_3.7.9  uuid_1.2-0  nlme_3.1-163  curl_5.2.1  proxy_0.4-27
[111] cachem_1.0.8  zoo_1.8-12  stringr_1.5.1  KernSmooth_2.23-22  miniUI_0.1.1.1
[116] Taxonstand_2.4  desc_1.4.3  leafsync_0.1.0  pillar_1.9.0  grid_4.3.2
[121] vctrs_0.6.5  slam_0.1-50  urlchecker_1.0.1  spatialrisk_0.7.1  promises_1.2.1
[126] flora_0.3.8  ff_4.0.12  ada_2.0-5  xtable_1.8-4  oai_0.4.0
[131] cli_3.6.2  taxize_0.9.100  compiler_4.3.2  rlang_1.1.3  crayon_1.5.2
[136] tmap_3.3-4  nls.multstart_1.3.0  colourvalues_0.3.9  future.apply_1.11.1  classInt_0.4-10
[141] plyr_1.8.9  fs_1.6.3  writexl_1.5.0  stringi_1.8.3  viridisLite_0.4.2
[146] stars_0.6-4  munsell_0.5.0  lazyeval_0.2.2  leaflet_2.2.1  devtools_2.4.5
[151] Matrix_1.6-1.1  bit64_4.0.5  leafem_0.2.3  future_1.33.1  ggplot2_3.5.0
[156] plantR_0.1.6  shiny_1.8.0  evd_2.3-6.1  igraph_2.0.2  memoise_2.0.1
[161] lwgeom_0.2-14  DEoptimR_1.1-3  bit_4.0.5  ape_5.7-1