ropensci / biomartr

Genomic Data Retrieval with R
https://docs.ropensci.org/biomartr

R session aborted in downloading process #75

Closed jimrpy closed 1 year ago

jimrpy commented 3 years ago

I run into this problem every time I try to run the download functions such as is.genome.available(), getCollection(), getGenome(), etc.

I'm really confused. Is this a bug, or how can I resolve it?

[Screenshot: "R Session Aborted" dialog]

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

other attached packages:
[1] biomartr_1.0.2 devtools_2.4.2 usethis_2.0.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.7 prettyunits_1.1.1 ps_1.6.0
[4] Biostrings_2.56.0 assertthat_0.2.1 rprojroot_2.0.2
[7] utf8_1.2.2 BiocFileCache_1.12.1 R6_2.5.1
[10] stats4_4.0.2 RSQLite_2.2.8 httr_1.4.2
[13] pillar_1.6.2 zlibbioc_1.34.0 rlang_0.4.11
[16] progress_1.2.2 curl_4.3.2 rstudioapi_0.13
[19] callr_3.7.0 blob_1.2.2 S4Vectors_0.26.1
[22] desc_1.3.0 stringr_1.4.0 RCurl_1.98-1.4
[25] bit_4.0.4 biomaRt_2.44.4 compiler_4.0.2
[28] pkgconfig_2.0.3 askpass_1.1 BiocGenerics_0.34.0
[31] pkgbuild_1.2.0 openssl_1.4.4 tidyselect_1.1.1
[34] tibble_3.1.4 IRanges_2.22.2 XML_3.99-0.7
[37] fansi_0.5.0 crayon_1.4.1 dplyr_1.0.5
[40] dbplyr_2.1.1 withr_2.4.2 bitops_1.0-7
[43] rappdirs_0.3.3 lifecycle_1.0.0 DBI_1.1.1
[46] magrittr_2.0.1 cli_3.0.1 stringi_1.7.4
[49] cachem_1.0.6 XVector_0.28.0 fs_1.5.0
[52] remotes_2.4.0 testthat_3.0.4 xml2_1.3.2
[55] ellipsis_0.3.2 vctrs_0.3.8 generics_0.1.0
[58] tools_4.0.2 bit64_4.0.5 Biobase_2.48.0
[61] glue_1.4.2 purrr_0.3.4 hms_1.1.0
[64] processx_3.5.2 pkgload_1.2.1 parallel_4.0.2
[67] fastmap_1.1.0 AnnotationDbi_1.50.3 sessioninfo_1.1.1
[70] memoise_2.0.0

cmatKhan commented 3 years ago

This happens to me if I run out of RAM. Not sure if that is the issue here; if it is not, ignore what is below.

It looks like you don't have this option in your environment pane; maybe try updating RStudio?

[Screenshot: RStudio environment pane showing the memory usage indicator]

If you're using a Linux OS, you can also try using top/htop:

https://htop.dev/

macOS and Windows have their own methods of monitoring memory (Activity Monitor and Task Manager, respectively).
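Regardless of OS, base R can also give a rough in-session view of memory use; a minimal sketch, no extra packages assumed:

```r
# Report memory currently used by R objects (also triggers garbage collection)
gc()

# Check the size of an individual object before loading something large
x <- numeric(1e6)                    # ~8 MB of doubles
print(object.size(x), units = "Mb")
```

If `gc()` already reports memory close to your machine's limit, a large download or import is a likely crash trigger.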

jimrpy commented 3 years ago

This is not caused by RAM on my Mac; it has 64 GB of RAM.

HajkD commented 2 years ago

Could you please retry with the current development version? Maybe this issue was related to #76?

npokorzynski commented 2 years ago

I am having this same issue, as shown in the screenshot.

The environment is also visible in the screenshot, but I've copied it here as well:

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomartr_1.0.2

loaded via a namespace (and not attached):
 [1] KEGGREST_1.32.0        progress_1.2.2         tidyselect_1.1.2       purrr_0.3.4            vctrs_0.3.8            generics_0.1.2         stats4_4.1.0           BiocFileCache_2.0.0    utf8_1.2.2             blob_1.2.2            
[11] XML_3.99-0.6           rlang_1.0.2            pillar_1.7.0           glue_1.6.2             DBI_1.1.1              rappdirs_0.3.3         BiocGenerics_0.38.0    bit64_4.0.5            dbplyr_2.1.1           GenomeInfoDbData_1.2.6
[21] lifecycle_1.0.1        stringr_1.4.0          zlibbioc_1.38.0        Biostrings_2.60.2      memoise_2.0.0          Biobase_2.52.0         IRanges_2.26.0         fastmap_1.1.0          biomaRt_2.48.3         GenomeInfoDb_1.28.1   
[31] parallel_4.1.0         curl_4.3.2             AnnotationDbi_1.54.1   fansi_1.0.2            Rcpp_1.0.7             filelock_1.0.2         cachem_1.0.5           S4Vectors_0.30.0       XVector_0.32.0         bit_4.0.4             
[41] hms_1.1.0              png_0.1-7              digest_0.6.29          stringi_1.7.3          dplyr_1.0.8            cli_3.2.0              tools_4.1.0            bitops_1.0-7           magrittr_2.0.2         RCurl_1.98-1.3        
[51] RSQLite_2.2.7          tibble_3.1.6           crayon_1.5.0           pkgconfig_2.0.3        ellipsis_0.3.2         data.table_1.14.2      xml2_1.3.2             prettyunits_1.1.1      assertthat_0.2.1       httr_1.4.2            
[61] rstudioapi_0.13        R6_2.5.1               compiler_4.1.0
[Screenshot: R session aborted dialog, 2022-03-15]

HajkD commented 2 years ago

This is due to the large summary files now provided by NCBI, which are loaded into memory as an R object. If you run out of memory, the R session aborts. The short-term solution is to run the commands on a machine equipped with more memory (RAM). The long-term solution is on my TODO list: I plan to replace the readr-based import of summary files into memory with a chunk-wise, on-disk import via disk.frame. Any help is very welcome.
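The chunk-wise idea can be sketched with readr's own chunked reader, `read_tsv_chunked()`, which processes a file in slices and keeps only matching rows in memory. The toy file layout and the organism filter below are illustrative, not biomartr's actual code:

```r
library(readr)

# Toy stand-in for a large NCBI assembly summary file (columns illustrative)
summary_file <- tempfile(fileext = ".txt")
writeLines(c(
  "accession\torganism_name",
  "GCF_000001735.4\tArabidopsis thaliana",
  "GCF_000001405.40\tHomo sapiens"
), summary_file)

# The callback keeps only rows matching the organism of interest,
# so the full table is never held in memory at once.
keep_hits <- function(chunk, pos) {
  chunk[grepl("Arabidopsis", chunk$organism_name), ]
}

hits <- read_tsv_chunked(
  summary_file,
  callback   = DataFrameCallback$new(keep_hits),
  chunk_size = 1   # tiny chunks just for the demo
)
hits$accession
```

The same callback pattern scales to multi-gigabyte summary files, since peak memory is bounded by `chunk_size` rather than file size.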

schymane commented 1 year ago

Not sure if this is remotely helpful (and I am commenting without looking into the functions used here), but I just encountered the same R session abort, repeatedly and reproducibly, while downloading relatively small files from NCBI, even though it worked for bigger files of a slightly different format (small = 260 kB, bigger = 1675 kB).

Switching to curl_download() from the curl package instead of base R's download.file() seems to have fixed the issue. So far ...
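The swap is a one-line change; a minimal sketch using a local file:// URL so it runs without network access (the URL and paths are illustrative, not the actual NCBI files):

```r
library(curl)

# Stand-in for a remote file
src  <- tempfile(fileext = ".txt")
writeLines("NCBI payload", src)
url  <- paste0("file://", src)
dest <- tempfile(fileext = ".txt")

# Previously: download.file(url, destfile = dest, mode = "wb")
# Replacement that avoided the session abort:
curl_download(url, destfile = dest, mode = "wb")
readLines(dest)
```

curl_download() streams via libcurl rather than R's internal download method, which can behave differently around TLS and large/odd responses.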

Roleren commented 1 year ago

This is not a memory issue; it was caused by the older download parameter specification.

Even though the problem is fixed, new problems can emerge if SSL certificates change again on the back-end servers (Ensembl/RefSeq).

If so, please make a new issue.

This issue can now be closed.

HajkD commented 1 year ago

Sounds perfect! Please feel free to notify us in the future in case this happens again.