sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)

Crash with pwiz backend on Windows #32

Closed lgatto closed 7 years ago

lgatto commented 8 years ago

mzR crashes on Windows when the pwiz backend is used. The same code works fine on other systems, and on Windows with the Ramp backend. The difficulty is that the crash does not happen on every system. The bug has been confirmed with other mzML files.

Reproduce the bug

library(AnnotationHub)
ah <- AnnotationHub()
rw <- ah[["AH49008"]] ## uses pwiz backend by default
head(peaks(rw, 1))

or, with a manual download

url <- "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML"
download.file(url, basename(url))
rw <- openMSfile(basename(url), backend = "pwiz") ## open the downloaded file
head(peaks(rw, 1))

while the following works

url <- "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML"
download.file(url, basename(url))
rw <- openMSfile(basename(url), backend = "Ramp") ## open the downloaded file
head(peaks(rw, 1))
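
Because the pwiz crash takes the whole R session down, it can be convenient to run each backend in a separate R process and only look at the exit status. The following is a minimal sketch of that idea, not part of mzR: `run_isolated()` and `backend_snippet()` are hypothetical helper names introduced here, and the sketch assumes that `Rscript` lives in `R.home("bin")` and that mzR is installed in the child process's library.

```r
## Hypothetical helper (not part of mzR): run a piece of R code in a fresh
## Rscript process and return that process's exit status. A crash in the
## child process then shows up as a non-zero status instead of killing the
## interactive session.
run_isolated <- function(code) {
  script <- tempfile(fileext = ".R")
  on.exit(unlink(script))
  writeLines(code, script)
  rscript <- file.path(R.home("bin"), "Rscript")
  ## With stdout/stderr discarded, system2() returns the exit status
  system2(rscript, shQuote(script), stdout = FALSE, stderr = FALSE)
}

## Hypothetical helper: build the code from the reproduction steps above
## for a given file and backend.
backend_snippet <- function(file, backend) {
  sprintf(
    "library(mzR); rw <- openMSfile(%s, backend = %s); print(head(peaks(rw, 1)))",
    deparse(file), deparse(backend)
  )
}
```

With the file downloaded as above, `run_isolated(backend_snippet(basename(url), "pwiz"))` and the same call with `"Ramp"` could then be compared; a status of 0 means the child process finished normally, anything else means it failed or crashed.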

A detailed crash report provided by Dan Tenenbaum

So I debugged as follows:

x <- selectMethod("peaks", "mzRpwiz")
debug(x)
peaks(rw, 1)

Then I kept setting breakpoints deeper down till I got here (line 82 of mzR/R/methods-mzRpwiz.R):

return(object@backend$getPeakList(scans)$peaks)

Looking at the getPeakList function:

Browse[3]> object@backend$getPeakList
Class method definition for method getPeakList()
function (...)
{
    " Rcpp::List getPeakList(int)  \n   docstring : Performs a non-sequential parsing operation on an indexed mzXML file to obtain the peak list for a numbered scan."
    .External(list(name = "CppMethod__invoke_notvoid", address = <pointer: 0x000000000a435f70>,
        dll = list(name = "Rcpp", path = "E:/biocbld/bbs-3.3-bioc/R/library/Rcpp/libs/x64/Rcpp.dll",
            dynamicLookup = TRUE, handle = <pointer: 0x000000006abc0000>,
            info = <pointer: 0x000000006dfcbf20>), numParameters = -1L),
        <pointer: 0x000000000a43fa40>, <pointer: 0x0000000012338900>,
        .pointer, ...)
}
<environment: 0x000000001f42fb38>

When I then step into it, I get the crash. sessionInfo() (run right before the line that causes the crash) follows.

R Under development (unstable) (2015-12-14 r69775)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] mzR_2.5.2                         Rcpp_0.12.2
 [3] ProteomicsAnnotationHubData_1.1.1 curl_0.9.4
 [5] AnnotationHubData_1.1.9           futile.logger_1.4.1
 [7] GenomicRanges_1.23.8              GenomeInfoDb_1.7.3
 [9] IRanges_2.5.18                    S4Vectors_0.9.15
[11] AnnotationHub_2.3.10              BiocGenerics_0.17.2
[13] BiocStyle_1.9.2

loaded via a namespace (and not attached):
 [1] Biobase_2.31.3               httr_1.0.0
 [3] vsn_3.39.1                   jsonlite_0.9.19
 [5] foreach_1.4.3                shiny_0.12.2
 [7] interactiveDisplayBase_1.9.0 affy_1.49.0
 [9] RBGL_1.47.0                  Rsamtools_1.23.1
[11] impute_1.45.0                RSQLite_1.0.0
[13] lattice_0.20-33              limma_3.27.7
[15] chron_2.3-47                 digest_0.6.8
[17] XVector_0.11.1               colorspace_1.2-6
[19] htmltools_0.3                httpuv_1.3.3
[21] preprocessCore_1.33.0        plyr_1.8.3
[23] OrganismDbi_1.13.2           GEOquery_2.37.0
[25] MALDIquant_1.14              XML_3.98-1.3
[27] biomaRt_2.27.2               rBiopaxParser_2.9.0
[29] zlibbioc_1.17.0              xtable_1.8-0
[31] scales_0.3.0                 affyio_1.41.0
[33] BiocParallel_1.5.12          ggplot2_2.0.0
[35] SummarizedExperiment_1.1.12  GenomicFeatures_1.23.15
[37] magrittr_1.5                 mime_0.4
[39] doParallel_1.0.10            graph_1.49.1
[41] BiocInstaller_1.21.2         tools_3.3.0
[43] data.table_1.9.6             stringr_1.0.0
[45] MSnbase_1.19.7               munsell_0.4.2
[47] AnnotationDbi_1.33.4         lambda.r_1.1.7
[49] Biostrings_2.39.6            pcaMethods_1.61.0
[51] mzID_1.9.0                   grid_3.3.0
[53] RCurl_1.95-4.7               iterators_1.0.8
[55] AnnotationForge_1.13.5       bitops_1.0-6
[57] gtable_0.1.2                 codetools_0.2-14
[59] DBI_0.3.1                    reshape2_1.4.1
[61] R6_2.1.1                     GenomicAlignments_1.7.8
[63] knitr_1.11                   rtracklayer_1.31.3
[65] ProtGenerics_1.3.3           futile.options_1.0.0
[67] stringi_1.0-1

In mzR

In openMSfile.Rd, we have:

 \dontrun{
    ## to use another backend
    mz <- openMSfile(file, backend = "pwiz")
    mz
  }

In mzR-class.Rd, we have:

 library(msdata)
 filepath <- system.file("microtofq", package = "msdata")
 file <- list.files(filepath, pattern="MM14.mzML",
                     full.names=TRUE, recursive = TRUE)
 mzml <- openMSfile(file, backend = "pwiz")

which seems to work (albeit on a small test file).

What needs to be done

thirdwing commented 8 years ago

I can have a try this weekend. Please wait.

lgatto commented 8 years ago

@thirdwing any luck?

thirdwing commented 8 years ago

Sorry for this. Really no time these days.

Qiang Kou qkou@umail.iu.edu School of Informatics and Computing, Indiana University