grimbough / rhdf5

Package providing an interface between HDF5 and R
http://bioconductor.org/packages/rhdf5

Segmentation Fault linked to package loading order #73

Closed mblanche closed 3 years ago

mblanche commented 3 years ago

Hi, for the past two weeks I've been experiencing segmentation faults with rhdf5. Initially everything was working fine, but I think I got a new version that started crashing.

The segmentation fault seems to be OS-specific. It happens on Ubuntu 18.04.5 LTS using R 4.0.2, but not on Mac. For some reason, running sessionInfo() after loading rhdf5 gives me two versions. Is this normal?

> sessionInfo()
 R version 4.0.2 (2020-06-22)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Ubuntu 18.04.5 LTS

 Matrix products: default
 BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

 locale:
  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
 [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] rhdf5_2.32.3

 loaded via a namespace (and not attached):
 [1] compiler_4.0.2  tools_4.0.2     parallel_4.0.2  Rhdf5lib_1.10.1

Also, I have a toy case that shows this issue. I have a function, getBins(), that reads some data from a toy rhdf5 file (https://dovetail-public.s3-us-west-2.amazonaws.com/rhdf5/toy1.mcool). On Ubuntu, loading rhdf5 before InteractionSet lets me pull the data from the file, while if I load InteractionSet before rhdf5, I get a segmentation fault every time. One of my colleagues tested on macOS Catalina with no issues but was able to reproduce the problem on Ubuntu.

Here's the script. The toy rhdf5 file can be downloaded here: https://dovetail-public.s3-us-west-2.amazonaws.com/rhdf5/toy1.mcool

################################################################################
### When loading rhdf5 first, no problem
################################################################################
library(rhdf5)
library(InteractionSet)

getBins <- function(file,res = NULL){

    bins.group <- ifelse(is.null(res),'/bins',paste('resolutions',res,'bins',sep='/'))

    a.chr <- h5read(file,name=paste(bins.group,'chrom',sep="/"))
    a.start <- h5read(file,name=paste(bins.group,'start',sep="/"))+1
    a.end <- h5read(file,name=paste(bins.group,'end',sep="/"))

    anchors <- GRanges(a.chr,
                       IRanges(a.start,a.end))

}

tmp <- getBins("toy1.mcool",res=1000)

quit('no')
################################################################################
### Quit R and restart from here
### If rhdf5 is loaded after InteractionSet, I get segmentation faults
################################################################################
library(InteractionSet)
library(rhdf5)

getBins <- function(file,res = NULL){

    bins.group <- ifelse(is.null(res),'/bins',paste('resolutions',res,'bins',sep='/'))

    a.chr <- h5read(file,name=paste(bins.group,'chrom',sep="/"))
    a.start <- h5read(file,name=paste(bins.group,'start',sep="/"))+1
    a.end <- h5read(file,name=paste(bins.group,'end',sep="/"))

    anchors <- GRanges(a.chr,
                       IRanges(a.start,a.end))

}

tmp <- getBins("toy1.mcool",res=1000)

Any chance you guys could look at this?

Many thanks

Marco

grimbough commented 3 years ago

I can confirm that I can reproduce the problem. This seems sufficient to crash R:

library(InteractionSet)
library(rhdf5)
h5read("toy1.mcool", name = "/resolutions/1000/bins")
# corrupted double-linked list
# [1]    4414 abort (core dumped)  R

Oddly, this seems to work without issue:

h5read("toy1.mcool", name = "/resolutions/2000/bins")

And then it's possible to read the 1000 group immediately afterwards without a crash. I'll keep digging.
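For anyone else reproducing this: because the crash aborts the whole R process, it can be convenient to run each load order in its own `Rscript` child process and compare exit statuses instead of restarting R by hand. The following is only a rough sketch. `run_in_fresh_session()` and `load_order_expr()` are hypothetical helper names, not part of rhdf5, and the file name and dataset path are taken from the examples above.

```r
# Run an R expression in a fresh Rscript process and return its exit status.
# A crash (segfault or abort) in the child shows up as a non-zero status
# instead of taking down the current session.
run_in_fresh_session <- function(expr) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, c("-e", shQuote(expr)), stdout = FALSE, stderr = FALSE)
}

# Hypothetical helper: build the reproduction code for a given load order.
load_order_expr <- function(pkgs, file = "toy1.mcool",
                            name = "/resolutions/1000/bins") {
  libs <- paste(sprintf("library(%s)", pkgs), collapse = "; ")
  sprintf("%s; invisible(rhdf5::h5read('%s', name = '%s'))", libs, file, name)
}

# Only attempt the comparison when the toy file is in the working directory.
if (file.exists("toy1.mcool")) {
  orders <- list(c("rhdf5", "InteractionSet"), c("InteractionSet", "rhdf5"))
  for (pkgs in orders) {
    status <- run_in_fresh_session(load_order_expr(pkgs))
    cat(paste(pkgs, collapse = " -> "), ": exit status", status, "\n")
  }
}
```

A zero status for one order and a non-zero status for the other would match the behaviour described in this issue. The exact non-zero value depends on the platform and shell.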

grimbough commented 3 years ago

I think I've tracked this down. Can you give this version a try?

BiocManager::install('grimbough/rhdf5', ref = 'RELEASE_3_11')

For me it can now run both parts of your example code without issue. Fingers crossed it's the same for you.
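After installing, it may be worth confirming in a new R session which versions are actually in use. Note that rhdf5 and Rhdf5lib are separate packages (Rhdf5lib supplies the compiled HDF5 library that rhdf5 builds on), so seeing both in sessionInfo() is expected. A small sketch, where `report_versions()` is a hypothetical helper name:

```r
# Return the installed version of each package as a string,
# or NA if the package is not installed.
report_versions <- function(pkgs = c("rhdf5", "Rhdf5lib")) {
  vapply(pkgs, function(p) {
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
}

print(report_versions())
```

The rhdf5 entry can then be compared against the 2.32.3 shown in the sessionInfo() output at the top of this issue.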

mblanche commented 3 years ago

Ok, seems like it's working. Thanks a lot for the fast answer. I'll let you know if anything else pops up

grimbough commented 3 years ago

Great, thanks for reporting back. I'll push the changes to Bioconductor. It should take a couple of days before the fix is available via that route.