alexpiper / taxreturn

An R package for creating taxonomic reference databases for metabarcoding studies
GNU General Public License v3.0

error in fetchSeqs(), object 'search_results' not found #22

Status: Closed (closed by morien 3 years ago)

morien commented 3 years ago

I'm getting an error related to search results from GenBank. It happens whether I search for mitochondrial sequences or for a specific gene (CO1). I'm hoping the devs can help me debug this and figure out what the real issue is. As far as I can tell, it has to do with the results table returned after a query, which is likely empty. That seems like the kind of thing that's normally handled with exception handling or if/else statements, so I'm guessing it's not as simple as that. The error only comes up when I pass a list of taxa, not when I query them individually.

fetchSeqs(species_list, database="genbank", out.dir="genbank", quiet=FALSE, marker="mitochondria", output = "gb-binom", force=TRUE, compress=TRUE, multithread = TRUE)
Multithreading with 47 cores
Downloading from genbank - No subsampling

Input marker is mitochondria, Downloading full mitochondrial genomes
Searching genbank with query:([ORGN]) AND mitochondrion[filter] AND genome
Input marker is mitochondria, Downloading full mitochondrial genomes
Searching genbank with query:(Aaaba fossicollis[ORGN]) AND mitochondrion[filter] AND genome
Error in data.frame(taxon = x, seqs_total = length(search_results$ids),  : 
  object 'search_results' not found

By contrast, if I run this species as a standalone query, no error is produced:

> fetchSeqs("Aaaba fossicollis", database="genbank", out.dir="genbank", quiet=FALSE, marker="mitochondria", output = "gb-binom", force=TRUE, compress=TRUE, multithread = TRUE)
Multithreading with 47 cores
Downloading from genbank - No subsampling
Input marker is mitochondria, Downloading full mitochondrial genomes
Searching genbank with query:(Aaaba fossicollis[ORGN]) AND mitochondrion[filter] AND genome
              taxon seqs_total seqs_downloaded       marker database
1 Aaaba fossicollis          0               0 mitochondria  nuccore
                 time
1 2020-11-16 15:09:28

I know it's a lot of text, but I'm going to paste my sessionInfo() output below:

R version 3.6.3 (2020-02-29)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.3.so

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] taxreturn_0.0.0.9000 forcats_0.5.0        stringr_1.4.0       
 [4] dplyr_1.0.2          purrr_0.3.4          readr_1.4.0         
 [7] tidyr_1.1.2          tibble_3.0.4         ggplot2_3.3.0       
[10] tidyverse_1.3.0      insect_1.2.0         ape_5.4-1           
[13] Biostrings_2.54.0    XVector_0.26.0       IRanges_2.20.2      
[16] S4Vectors_0.24.4     BiocGenerics_0.32.0 

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1        seqinr_4.2-4            ellipsis_0.3.1         
  [4] fs_1.5.0                rstudioapi_0.11         httpcode_0.3.0         
  [7] listenv_0.8.0           furrr_0.2.1             bit64_4.0.5            
 [10] fansi_0.4.1             lubridate_1.7.8         xml2_1.3.2             
 [13] codetools_0.2-16        R.methodsS3_1.8.1       mnormt_2.0.2           
 [16] bold_1.1.0              phylogram_2.1.0         ade4_1.7-16            
 [19] jsonlite_1.7.1          entropy_1.2.1           broom_0.5.6            
 [22] dbplyr_1.4.3            R.oo_1.24.0             data.tree_1.0.0        
 [25] rentrez_1.2.2           compiler_3.6.3          httr_1.4.2             
 [28] backports_1.1.6         assertthat_0.2.1        Matrix_1.2-18          
 [31] aphid_1.3.3             cli_2.1.0               prettyunits_1.1.1      
 [34] tools_3.6.3             igraph_1.2.6            taxize_0.9.99          
 [37] coda_0.19-4             gtable_0.3.0            glue_1.4.2             
 [40] RANN_2.6.1              clusterGeneration_1.3.5 biofiles_1.0.0.9000
[43] reutils_0.2.3           maps_3.3.0              fastmatch_1.1-0        
 [46] Rcpp_1.0.5              cellranger_1.1.0        vctrs_0.3.4            
 [49] crul_1.0.0              nlme_3.1-147            conditionz_0.1.0       
 [52] DECIPHER_2.14.0         iterators_1.0.13        globals_0.13.1         
 [55] ps_1.3.2                rvest_0.3.6             lifecycle_0.2.0        
 [58] phangorn_2.5.5          gtools_3.8.2            XML_3.99-0.3           
 [61] future_1.20.1           zoo_1.8-8               zlibbioc_1.32.0        
 [64] MASS_7.3-51.6           scales_1.1.0            vroom_1.3.2            
 [67] hms_0.5.3               expm_0.999-5            curl_4.3               
 [70] memoise_1.1.0           pbapply_1.4-3           reshape_0.8.8          
 [73] stringi_1.5.3           RSQLite_2.2.1           plotrix_3.7-8          
 [76] foreach_1.5.1           kmer_1.1.2              phytools_0.7-70        
 [79] rlang_0.4.8             pkgconfig_2.0.3         bitops_1.0-6           
 [82] lattice_0.20-41         bit_4.0.4               tidyselect_1.1.0       
 [85] parallelly_1.21.0       plyr_1.8.6              magrittr_1.5           
 [88] R6_2.5.0                generics_0.1.0          combinat_0.0-8         
 [91] DBI_1.1.0               pillar_1.4.6            haven_2.2.0            
 [94] withr_2.3.0             scatterplot3d_0.3-41    RCurl_1.98-1.2         
 [97] modelr_0.1.6            crayon_1.3.4            uuid_0.1-4             
[100] tmvnsim_1.0-2           progress_1.2.2          grid_3.6.3             
[103] readxl_1.3.1            data.table_1.13.2       blob_1.2.1             
[106] reprex_0.3.0            digest_0.6.27           numDeriv_2016.8-1.1    
[109] R.utils_2.10.1          openssl_1.4.3           munsell_0.5.0          
[112] quadprog_1.5-8          askpass_1.1

alexpiper commented 3 years ago

I've changed the error handling in the fetchSeqs function, which I believe should resolve this issue. Install the package from GitHub again, and let me know if you still encounter this issue.
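
For readers who hit a similar message, here is a minimal sketch of one way an "object 'search_results' not found" error can arise, and how guarding the search avoids it. This is not the actual taxreturn code, and the thread does not confirm the real cause. `mock_search`, `summarise_unguarded`, and `summarise_guarded` are hypothetical names used only for illustration, and the failing input (an empty taxon name) is an assumption.

```r
# Hypothetical stand-in for a GenBank search; NOT the taxreturn or rentrez API.
# It throws an error for an empty taxon name, mimicking a failed search.
mock_search <- function(taxon) {
  if (is.na(taxon) || !nzchar(taxon)) stop("empty search term")
  list(ids = character(0))  # a valid search that finds no records
}

# Unguarded: if the search errors inside try(), search_results is never
# assigned, so the later data.frame() call fails with "object not found".
summarise_unguarded <- function(taxon) {
  try(search_results <- mock_search(taxon), silent = TRUE)
  data.frame(taxon = taxon, seqs_total = length(search_results$ids))
}

# Guarded: tryCatch() always returns a value, so a failed search is
# reported as NA instead of aborting the whole run.
summarise_guarded <- function(taxon) {
  search_results <- tryCatch(mock_search(taxon), error = function(e) NULL)
  n <- if (is.null(search_results)) NA_integer_ else length(search_results$ids)
  data.frame(taxon = taxon, seqs_total = n, stringsAsFactors = FALSE)
}

# One failing and one succeeding taxon, processed in the same list.
taxa <- c("", "Aaaba fossicollis")
results <- do.call(rbind, lapply(taxa, summarise_guarded))
print(results)
```

In the guarded version, a search that fails is kept distinct from one that succeeds with zero hits: the former is recorded as NA and the latter as 0, so a single bad entry in a list of taxa does not stop the remaining queries.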