cmap / cmapR

Tools for manipulating annotated data matrices
BSD 3-Clause "New" or "Revised" License

parse_gctx Error: segfault from C stack overflow #65

Open FogatoHub opened 3 years ago

FogatoHub commented 3 years ago

Hi, I need help using the parsing function of cmapR.

I was testing the cmapR library by following the tutorial, and parse_gctx fails when I try to parse the small 77 KB "modzs_n25x50.gctx" file provided with the package:

ds_path <- system.file("extdata", "modzs_n25x50.gctx", package="cmapR")
my_ds <- parse_gctx(ds_path)
reading /home/usr/R/x86_64-pc-linux-gnu-library/4.0/cmapR/extdata/modzs_n25x50.gctx
Error: segfault from C stack overflow

I get the same error if I try to parse a subset of the file as described in the tutorial:

my_ds_10_columns <- parse_gctx(ds_path, cid=1:10)

I checked my resource limits with the unix package. Most of them are unlimited, although the current stack limit is 8388608 bytes (8 MB):

> library(unix)
> rlimit_all() 
$cur
      as     core      cpu     data    fsize  memlock   nofile    nproc    stack 
     Inf        0      Inf      Inf      Inf 67108864     8192    63355  8388608 

$max
      as     core      cpu     data    fsize  memlock   nofile    nproc    stack 
     Inf      Inf      Inf      Inf      Inf 67108864  1048576    63355      Inf 
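
The stack limit shown above can also be cross-checked from base R, without the unix package. This is only a minimal sketch for diagnosis: Cstack_info() is a base R function, while the percentage calculation is an illustrative addition and does not explain the segfault by itself.

```r
# Cstack_info() is base R; it reports the C stack size R believes it has
# (in bytes) and how much of it is currently in use.
info <- Cstack_info()
print(info)

# Fraction of the C stack currently used (NA if the platform does not
# report a stack size).
usage <- unname(info[["current"]] / info[["size"]])
cat(sprintf("C stack in use: %.2f%%\n", 100 * usage))
```

If the reported size is close to the 8388608 bytes listed by rlimit_all(), R is seeing the same limit as the shell.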

My operating system is Ubuntu 20.04, and my R version is:

R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

Thank you

innesbre commented 3 years ago

Same problem here:

> library(cmapR)
> lvl4_data <- parse_gctx("~/Data_LINCS/2020beta/level4_beta_trt_misc_n26428x12328.gctx")
reading ~/Data_LINCS/2020beta/level4_beta_trt_misc_n26428x12328.gctx
Error: segfault from C stack overflow
> R.version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          0.3                         
year           2020                        
month          10                          
day            10                          
svn rev        79318                       
language       R                           
version.string R version 4.0.3 (2020-10-10)
nickname       Bunny-Wunnies Freak Out 
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cmapR_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  XVector_0.30.0             
 [3] GenomicRanges_1.42.0        BiocGenerics_0.36.0        
 [5] zlibbioc_1.36.0             IRanges_2.24.1             
 [7] flowCore_2.2.0              lattice_0.20-41            
 [9] GenomeInfoDb_1.26.2         tools_4.0.3                
[11] SummarizedExperiment_1.20.0 parallel_4.0.3             
[13] grid_4.0.3                  rhdf5_2.34.0               
[15] Biobase_2.50.0              matrixStats_0.58.0         
[17] RcppParallel_5.0.2          Matrix_1.3-2               
[19] GenomeInfoDbData_1.2.4      Rhdf5lib_1.12.1            
[21] cytolib_2.2.1               RProtoBufLib_2.2.0         
[23] rhdf5filters_1.2.0          S4Vectors_0.28.1           
[25] bitops_1.0-6                RCurl_1.98-1.2             
[27] DelayedArray_0.16.1         compiler_4.0.3             
[29] MatrixGenerics_1.2.1        stats4_4.0.3 

innesbre commented 3 years ago

I used cmapR::parse.gctx() just fine with R 4.0.2 last spring, so it must be a relatively new bug.

RussBainer commented 3 years ago

I'm also hitting this bug. Any ideas about workarounds?

> ds_10_columns <- parse_gctx(ds_path, cid=1:10, rid= 1:10)
reading GSE92742_Broad_LINCS_Level5_COMPZ.MODZ_n473647x12328.gctx
Error: segfault from C stack overflow

> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] cmapR_1.2.1         ckanr_0.6.0         DBI_1.1.1
[4] BiocManager_1.30.10

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.6                  cytolib_2.2.1
 [3] XVector_0.30.0              pillar_1.5.1
 [5] compiler_4.0.3              dbplyr_2.1.0
 [7] GenomeInfoDb_1.26.4         rhdf5filters_1.2.0
 [9] zlibbioc_1.36.0             MatrixGenerics_1.2.1
[11] bitops_1.0-6                tools_4.0.3
[13] rhdf5_2.34.0                lattice_0.20-41
[15] jsonlite_1.7.2              lifecycle_1.0.0
[17] tibble_3.1.0                debugme_1.1.0
[19] pkgconfig_2.0.3             rlang_0.4.10
[21] Matrix_1.3-2                DelayedArray_0.16.2
[23] crul_1.1.0                  curl_4.3
[25] parallel_4.0.3              GenomeInfoDbData_1.2.4
[27] dplyr_1.0.5                 generics_0.1.0
[29] vctrs_0.3.6                 S4Vectors_0.28.1
[31] IRanges_2.24.1              grid_4.0.3
[33] stats4_4.0.3                tidyselect_1.1.0
[35] Biobase_2.50.0              glue_1.4.2
[37] httpcode_0.3.0              R6_2.5.0
[39] fansi_0.4.2                 Rhdf5lib_1.12.1
[41] RProtoBufLib_2.2.0          purrr_0.3.4
[43] magrittr_2.0.1              ellipsis_0.3.1
[45] matrixStats_0.58.0          BiocGenerics_0.36.0
[47] GenomicRanges_1.42.0        assertthat_0.2.1
[49] SummarizedExperiment_1.20.0 flowCore_2.2.0
[51] utf8_1.2.1                  RcppParallel_5.0.3
[53] RCurl_1.98-1.2              crayon_1.4.1

rajivnarayan commented 3 years ago

I ran into similar segfaults with R-4.0.3 and the latest cmapR package. I noticed that loading the rhdf5 library before cmapR seems to work for me, i.e.:

library(rhdf5)
library(cmapR)
gctx_file <- system.file('extdata', 'modzs_n25x50.gctx', package='cmapR') 
x <- parse_gctx(gctx_file)

Equivalently, rebuilding the cmapR package after listing rhdf5 as a dependency (Depends) instead of an import (Imports) in the DESCRIPTION file also works.

Version info: R-4.0.3 cmapR-1.2.1 rhdf5 2.34.4 rhdf5lib 1.10.1
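
To confirm that the load order in this workaround is what a given session actually has, the search path can be inspected. The following is a minimal sketch; attached_before() is a hypothetical helper written for illustration and is not part of cmapR or rhdf5.

```r
# Hypothetical helper: TRUE if package `first` is attached and was attached
# before package `second` (or `second` is not attached at all).
attached_before <- function(first, second, path = search()) {
  i <- match(paste0("package:", first), path)
  j <- match(paste0("package:", second), path)
  # Packages attached later sit earlier on the search path, so a package
  # attached first has the larger index.
  !is.na(i) && (is.na(j) || i > j)
}

# Example with an explicit search path, as it would look after
# library(rhdf5) followed by library(cmapR):
example_path <- c(".GlobalEnv", "package:cmapR", "package:rhdf5", "package:base")
attached_before("rhdf5", "cmapR", path = example_path)
```

In a live session, calling attached_before("rhdf5", "cmapR") with the default path checks the current search() output instead of the example vector.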

tnat1031 commented 3 years ago

Hi all,

Thanks for the detailed descriptions and my apologies for the late reply. I am actually having trouble reproducing this issue, though admittedly I'm on a Mac running an Ubuntu docker image (rocker/tidyverse:4.0.3) and installing cmapR from Bioconductor.

$ docker run --rm -p 8787:8787 -e PASSWORD=somepassword rocker/tidyverse:4.0.3

# within RStudio on the docker container
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("cmapR")
library(cmapR)
example("parse_gctx") # runs without errors

It also works fine installing directly onto my local Mac (running R version 4.0.3) from Bioconductor. So I'm a bit puzzled. I'd prefer not to make rhdf5 a dependency for cmapR, as that could potentially lead to namespace conflicts with other packages, but I'm not immediately sure of a better fix. Let me look into it a bit more. I'm of course open to other suggestions.

Thanks a lot, Ted

RussBainer commented 3 years ago

Following up, I tried the rhdf5 workaround suggested by @rajivnarayan and I continue to get the segfault, but I notice that I am getting a slightly older version from Bioconductor (2.34.0 vs 2.34.4).
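
A version difference like this can be checked programmatically before retrying the workaround. This is a minimal sketch: packageVersion() and requireNamespace() are base R, while meets_version() is a hypothetical helper, and 2.34.4 is simply the version reported as working earlier in the thread, not a confirmed minimum.

```r
# Hypothetical helper: TRUE if `pkg` is installed and its version is at
# least `minimum`; FALSE if it is older or not installed at all.
meets_version <- function(pkg, minimum) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= minimum
}

# Version strings compare component by component, not alphabetically.
package_version("2.34.0") < package_version("2.34.4")

# In a session where rhdf5 is installed, this reports whether it is at
# least the version mentioned above.
meets_version("rhdf5", "2.34.4")
```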

Sadly, the current github build of rhdf5 is erroring out on install, so I haven't been able to test further.

tnat1031 commented 3 years ago

Hi Russell,

Sorry to hear you're having trouble with the github source code. I am able to install that successfully on the same ubuntu docker (rocker/tidyverse:4.0.3). Could you please share the error message you're getting?

Thanks a lot, Ted

RussBainer commented 3 years ago

Hi Ted,

Here's the error I get from a devtools::install_github() call:

[Many lines of compiler output]

** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘rhdf5’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/00LOCK-rhdf5/00new/rhdf5/libs/rhdf5.so':
  /usr/local/lib/R/site-library/00LOCK-rhdf5/00new/rhdf5/libs/rhdf5.so: undefined symbol: H5Scombine_select
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/usr/local/lib/R/site-library/rhdf5’
Error: Failed to install 'rhdf5' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpscBSTb/file261c4a49497f/rhdf5_2.35.2.tar.gz’ had non-zero exit status

Will try installing into the docker container that you suggest as a workaround. Thanks!

tnat1031 commented 3 years ago

Hi Russell,

Ok, thanks for sharing that error log. I simply ran devtools::install_github() on that docker image and it worked fine. Looks like the issue you're hitting has something to do with the rhdf5 library. Does that package install successfully if you try to install it by itself using BiocManager::install("rhdf5") ?
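
To separate "rhdf5 cannot be loaded" from "cmapR crashes while parsing", it may help to try loading each namespace on its own and capture the error text. This is a minimal sketch; try_load() is a hypothetical helper, and whether the undefined symbol comes from a mismatch between rhdf5 and Rhdf5lib is an assumption that the thread does not confirm.

```r
# Hypothetical helper: attempt to load a package namespace and return
# whether it succeeded, plus the error message if it did not.
try_load <- function(pkg) {
  tryCatch({
    loadNamespace(pkg)
    list(ok = TRUE, message = NA_character_)
  }, error = function(e) {
    list(ok = FALSE, message = conditionMessage(e))
  })
}

# Check the HDF5-related packages one at a time; a failure in Rhdf5lib or
# rhdf5 would point away from cmapR itself.
for (pkg in c("Rhdf5lib", "rhdf5", "cmapR")) {
  res <- try_load(pkg)
  cat(pkg, ":", if (res$ok) "loaded" else paste("FAILED:", res$message), "\n")
}
```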

Thanks a lot, Ted

RussBainer commented 3 years ago

No, I'm sorry if my previous note was ambiguous: the GitHub error comes from the rhdf5 library install, not cmapR.

I'll update when I have had time to try the docker container install. Sorry, I'm juggling a lot of things at the moment.

-R


tnat1031 commented 3 years ago

Ok thanks for the clarification and no worries at all. Best of luck and please let me know if I can help in any way.

Best, Ted

RussBainer commented 3 years ago

Can confirm that the install and run works as promised in the specified docker container; this is a fine workaround IMHO.

Not sure how interested you are in tracking down architecture-specific bugs, but I'm running this on an AWS EC2 instance:

uname -a
Linux 4.15.0-1045-aws #47-Ubuntu SMP Fri Aug 2 13:50:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

tnat1031 commented 3 years ago

Hi Russell,

Ok great, glad the docker container works for you. Thanks for the details on the EC2 instance. If I have time I'll see if I can figure out what's causing the problem.

Best, Ted
