lgatto / MSnbase

Base Classes and Functions for Mass Spectrometry and Proteomics
http://lgatto.github.io/MSnbase/

Extract chromatogram object(s) from mzML Data #556

Closed: bathmer closed this issue 6 months ago

bathmer commented 2 years ago

Hi,

I am having problems importing mzML/mzXML files with the readMSData function. The files are originally Agilent ".D" data, which I converted to mzML/mzXML (version 1.1) using OpenChrom (version 1.4). If I run the following, I get an error:

od <- readMSData("MAP0483.mzML",  mode = "onDisk")

Attached files: MAP0483.zip, MAP0483.D.zip

tic <- chromatogram(od, aggregationFun = "sum")
Error: BiocParallel errors
  element index: 1
  first error: BiocParallel errors
  element index: 1
  first error: [IO::HandlerBinaryDataArray] At position 845: encoded lengths differ.

How can I import these files?

Benedikt


library(MSnbase)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB,
    parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find, get,
    grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position,
    rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for
    packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    expand.grid, I, unname

Loading required package: ProtGenerics

Attaching package: ‘ProtGenerics’

The following object is masked from ‘package:stats’:

    smooth

This is MSnbase version 2.18.0 
  Visit https://lgatto.github.io/MSnbase/ to get started.

Attaching package: ‘MSnbase’

The following object is masked from ‘package:base’:

    trimws

Warning message:
In fun(libname, pkgname) :
  mzR has been built against a different Rcpp version (1.0.6)
than is installed on your system (1.0.7). This might lead to errors
when loading mzR. If you encounter such issues, please send a report,
including the output of sessionInfo() to the Bioc support forum at 
https://support.bioconductor.org/. For details see also
https://github.com/sneumann/mzR/wiki/mzR-Rcpp-compiler-linker-issue.
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MSnbase_2.18.0      ProtGenerics_1.24.0 S4Vectors_0.30.2    mzR_2.26.1          Rcpp_1.0.7          Biobase_2.52.0      BiocGenerics_0.38.0

loaded via a namespace (and not attached):
 [1] BiocManager_1.30.16   plyr_1.8.6            compiler_4.1.1        pillar_1.6.4          iterators_1.0.13      zlibbioc_1.38.0      
 [7] tools_4.1.1           digest_0.6.28         MALDIquant_1.20       ncdf4_1.17            preprocessCore_1.54.0 lifecycle_1.0.1      
[13] tibble_3.1.5          gtable_0.3.0          lattice_0.20-45       clue_0.3-60           pkgconfig_2.0.3       rlang_0.4.12         
[19] foreach_1.5.1         DBI_1.1.1             dplyr_1.0.7           cluster_2.1.2         IRanges_2.26.0        generics_0.1.1       
[25] vctrs_0.3.8           MsCoreUtils_1.4.0     grid_4.1.1            tidyselect_1.1.1      glue_1.4.2            impute_1.66.0        
[31] R6_2.5.1              fansi_0.5.0           XML_3.99-0.8          BiocParallel_1.26.2   limma_3.48.3          ggplot2_3.3.5        
[37] purrr_0.3.4           magrittr_2.0.1        pcaMethods_1.84.0     scales_1.1.1          codetools_0.2-18      ellipsis_0.3.2       
[43] MASS_7.3-54           mzID_1.30.0           assertthat_0.2.1      colorspace_2.0-2      utf8_1.2.2            affy_1.70.0          
[49] doParallel_1.0.16     munsell_0.5.0         vsn_3.60.0            crayon_1.4.2          affyio_1.62.0        

lgatto commented 2 years ago

I have no experience with Agilent data and their conversion to mzML. In this case, the mzML files aren't compatible with MSnbase (or Spectra, which I also tested), but I can't say whether this is a general problem with Agilent data or if it is related to the conversion.

Maybe @jorainer has an idea?
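
For anyone who wants to reproduce this cross-check, here is a minimal sketch that tries the same file with both MSnbase and Spectra and collects the error message from each. The helper names capture_error and try_import are hypothetical and not part of either package. The sketch assumes that MSnbase, Spectra, mzR and ProtGenerics are installed, and that the failure shows up once the peak data is actually read (here via intensity()).

```r
# Hypothetical helper: evaluate an expression and return the error message,
# or NA if it evaluates without error.
capture_error <- function(expr) {
    tryCatch({
        force(expr)
        NA_character_
    }, error = function(e) conditionMessage(e))
}

# Hypothetical helper: try to read the peak data of one file with both
# packages. Reading the intensities forces the binary data arrays to be
# decoded, which is where the error in this report appears to come from.
try_import <- function(path) {
    c(
        MSnbase = capture_error({
            od <- MSnbase::readMSData(path, mode = "onDisk")
            ProtGenerics::intensity(od)
        }),
        Spectra = capture_error({
            sp <- Spectra::Spectra(path, source = Spectra::MsBackendMzR())
            ProtGenerics::intensity(sp)
        })
    )
}
```

Calling try_import("MAP0483.mzML") would return a named character vector with one entry per package. If both entries carry an error from the same underlying reader, the file itself is the likely culprit rather than either R package.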

jorainer commented 2 years ago

It seems that ProteoWizard (which the mzR package uses to import the data) has a problem with these files. I've never used OpenChrom for conversion; maybe you could try converting the files with ProteoWizard's msconvert instead?
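
A minimal sketch of that suggestion, driven from R. It assumes that ProteoWizard's msconvert is installed and on the PATH, that it can read the original Agilent ".D" folder on this machine (vendor formats generally need the Windows build), and that the paths contain no spaces. The helper names msconvert_args and convert_and_read and the output folder "converted" are hypothetical. --mzML and -o are msconvert options, and readMSData and chromatogram are the calls from the original report.

```r
# Hypothetical helper: build the msconvert arguments for an mzML conversion.
# Paths are passed through unquoted, so they must not contain spaces.
msconvert_args <- function(input, outdir) {
    c(input, "--mzML", "-o", outdir)
}

# Hypothetical helper: convert the vendor data with msconvert, then read the
# result on disk and extract the total ion chromatogram.
convert_and_read <- function(input, outdir = "converted") {
    status <- system2("msconvert", msconvert_args(input, outdir))
    if (status != 0) {
        stop("msconvert failed with exit status ", status)
    }
    # Pick up whatever mzML file(s) msconvert wrote instead of guessing names.
    mzml <- list.files(outdir, pattern = "\\.mzML$", full.names = TRUE)
    if (length(mzml) == 0) {
        stop("no mzML file found in ", outdir)
    }
    suppressPackageStartupMessages(library(MSnbase))
    od <- readMSData(mzml, mode = "onDisk")
    chromatogram(od, aggregationFun = "sum")
}
```

With the unzipped data in the working directory, tic <- convert_and_read("MAP0483.D") would then replace the two calls from the report. If the import still fails on the msconvert output, that would point at the data rather than at the OpenChrom export.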