mervesa / HiCDCPlus

HiCDCPlus
14 stars 2 forks source link

Memory error with the code in the vignette #19

Open mzytnicki opened 2 years ago

mzytnicki commented 2 years ago

Dear authors,

I try to reproduce (part of) the code in the vignette.

    library(HiCDCPlus)
    outdir<-tempdir(check=TRUE)
    construct_features(output_path=paste0(outdir,"/hg38_50kb_GATC"),
                       gen="Hsapiens",gen_ver="hg38",
                       sig="GATC",bin_type="Bins-uniform",
                       binsize=50000,
                       chrs=c("chr22"))
    hicfile_paths<-c(
    system.file("extdata", "GSE131651_NSD2_LOW_arima_example.hic", package = "HiCDCPlus"),
    system.file("extdata", "GSE131651_NSD2_HIGH_arima_example.hic", package = "HiCDCPlus"),
    system.file("extdata", "GSE131651_TKOCTCF_new_example.hic", package = "HiCDCPlus"),
    system.file("extdata", "GSE131651_NTKOCTCF_new_example.hic", package = "HiCDCPlus"))
    for(hicfile_path in hicfile_paths){
            output_path<-paste0(outdir,'/',
                                gsub("^(.*[\\/])", "",gsub('.hic','.txt.gz',hicfile_path)))
            #generate gi_list instance
            gi_list<-generate_bintolen_gi_list(
              bintolen_path=paste0(outdir,"/hg38_50kb_GATC_bintolen.txt.gz"),
              gen="Hsapiens",gen_ver="hg38")
            gi_list<-add_hic_counts(gi_list,hic_path = hicfile_path)
            #expand features for modeling
            gi_list<-expand_1D_features(gi_list)
            #run HiC-DC+ on 2 cores
            set.seed(1010) #HiC-DC downsamples rows for modeling
            gi_list<-HiCDCPlus(gi_list,ssize=0.1)
            #write results to a text file
            gi_list_write(gi_list,fname=output_path)
    }                                                                                                                                                                                                          

I have the following error (translated from French):

Error : R_Reprotect : only 1 element protected, impossible to reprotect index -4

The numbers mentioned (here 1 and -4) vary. Each time I run the code, the numbers change.

Oddly enough, using only one (whichever one) of the file does not trigger an error.

I then run the code with:

R -d valgrind -f test.R

I get:

==3827811== Conditional jump or move depends on uninitialised value(s)                                   
==3827811==    at 0x483F658: memset (vg_replace_strmem.c:1251)                                           
==3827811==    by 0xD31912F: memset (string_fortified.h:71)                                              
==3827811==    by 0xD31912F: select_hits (Hits_class.c:324)          
==3827811==    by 0x4945013: ??? (in /usr/lib/R/lib/libR.so)                                    
==3827811==    by 0x49455C4: ??? (in /usr/lib/R/lib/libR.so)                                             
==3827811==    by 0x498DFE1: ??? (in /usr/lib/R/lib/libR.so)                                             
==3827811==    by 0x499C98F: Rf_eval (in /usr/lib/R/lib/libR.so)
==3827811==    by 0x499E62D: ??? (in /usr/lib/R/lib/libR.so)
==3827811==    by 0x499F44C: Rf_applyClosure (in /usr/lib/R/lib/libR.so)
==3827811==    by 0x498C5FF: ??? (in /usr/lib/R/lib/libR.so)         
==3827811==    by 0x499C98F: Rf_eval (in /usr/lib/R/lib/libR.so)   
==3827811==    by 0x499E62D: ??? (in /usr/lib/R/lib/libR.so)                    
==3827811==    by 0x499F44C: Rf_applyClosure (in /usr/lib/R/lib/libR.so)                                 

The error is triggered by gi_list_write.

I would be most grateful if you had the time to have a look.

Thanks!

Session info:

[1] HiCDCPlus_1.4.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3                pillar_1.7.0               
 [3] compiler_4.2.0              restfulr_0.0.14            
 [5] GenomeInfoDb_1.32.2         XVector_0.36.0             
 [7] MatrixGenerics_1.8.0        R.methodsS3_1.8.1          
 [9] bitops_1.0-7                R.utils_2.11.0             
[11] tools_4.2.0                 zlibbioc_1.42.0            
[13] tibble_3.1.7                lifecycle_1.0.1            
[15] BSgenome_1.64.0             lattice_0.20-45            
[17] pkgconfig_2.0.3             rlang_1.0.2                
[19] Matrix_1.4-1                DBI_1.1.2                  
[21] DelayedArray_0.22.0         cli_3.3.0                  
[23] yaml_2.3.5                  parallel_4.2.0             
[25] GenomeInfoDbData_1.2.8      rtracklayer_1.56.0         
[27] dplyr_1.0.9                 generics_0.1.2             
[29] vctrs_0.4.1                 Biostrings_2.64.0          
[31] S4Vectors_0.34.0            IRanges_2.30.0             
[33] tidyselect_1.1.2            stats4_4.2.0               
[35] grid_4.2.0                  data.table_1.14.2          
[37] glue_1.6.2                  Biobase_2.56.0             
[39] R6_2.5.1                    fansi_1.0.3                
[41] XML_3.99-0.10               BiocParallel_1.30.3        
[43] purrr_0.3.4                 magrittr_2.0.3             
[45] ellipsis_0.3.2              Rsamtools_2.12.0           
[47] codetools_0.2-18            matrixStats_0.62.0         
[49] BiocGenerics_0.42.0         GenomicRanges_1.48.0       
[51] GenomicAlignments_1.32.0    splines_4.2.0              
[53] assertthat_0.2.1            SummarizedExperiment_1.26.1
[55] utf8_1.2.2                  RCurl_1.98-1.7             
[57] crayon_1.5.1                rjson_0.2.21               
[59] BiocIO_1.6.0                R.oo_1.24.0                
mervesa commented 1 year ago

The root cause could be the same as that of #1 . Could you please try upgrading your data.table to 1.13.7 and setting setDTthreads(threads = 1) at the header of your script?