sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)

[Feature Request]: isolationWindow #169

Closed blosloos closed 6 years ago

blosloos commented 6 years ago

Dear Steffen - thanks for the mzR package and its maintenance; it makes mass spec file handling quite convenient. I just stumbled over a small issue and am wondering whether I am doing something wrong here. When trying to extract the mass isolation window of MS2 scans from various .mzXML files, I get an empty matrix, regardless of the settings for the function parameters unique. or simplify in:

library(mzR)
mzXML_file <- openMSfile(filename = file_path)
isolationWindow(mzXML_file, unique. = FALSE, simplify = TRUE)

where file_path is the path to the .mzXML file concerned. I have checked that these files contain MS2 scans and XML entries such as

<precursorMz precursorScanNum="2" precursorIntensity="6.425451875e05" activationMethod="HCD" windowWideness="80.0">385.0</precursorMz>

and I can extract all other scan infos via header().

Thank you for your help & regards, Martin

sneumann commented 6 years ago

Hi, quick shot while walking, does the same happen if you convert to mzML?



lgatto commented 6 years ago

The isolation window accessor searches for cv parameters MS:1000828 (low) and MS:1000829 (high). As @sneumann mentions, these might not be available in mzXML files. They certainly aren't present in the excerpt you show above.
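
A quick way to see whether a given file carries these two terms at all is to search its raw text for the accession numbers. The following is only a minimal sketch in base R: has_isolation_offsets() is a hypothetical helper, not part of mzR, and the two cvParam lines are illustrative mzML-style text written for this example, not taken from a real file.

```r
# Hypothetical helper (NOT part of mzR): TRUE if the given text lines
# mention both isolation window offset accessions.
has_isolation_offsets <- function(lines) {
  any(grepl("MS:1000828", lines, fixed = TRUE)) &&
    any(grepl("MS:1000829", lines, fixed = TRUE))
}

# The mzXML excerpt quoted earlier in this thread.
mzxml_line <- '<precursorMz precursorScanNum="2" precursorIntensity="6.425451875e05" activationMethod="HCD" windowWideness="80.0">385.0</precursorMz>'

# Illustrative mzML-style lines (an assumption for this example).
mzml_lines <- c(
  '<cvParam cvRef="MS" accession="MS:1000828" name="isolation window lower offset" value="0.35"/>',
  '<cvParam cvRef="MS" accession="MS:1000829" name="isolation window upper offset" value="0.35"/>'
)

has_isolation_offsets(mzxml_line)  # FALSE
has_isolation_offsets(mzml_lines)  # TRUE
```

For a real file, something like has_isolation_offsets(readLines(file_path)) would do the same check, at the cost of reading the whole file into memory.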

You could check that things work as expected on test data (an mzML file in this case) with the following:

> library("msdata")
> f <- proteomics(full.names = TRUE)
> basename(f[3])
[1] "MS3TMT11.mzML"
> library("MSnbase")
> x <- readMSData(f[3], mode = "onDisk")
> isolationWindow(x) ## uses the code from mzR, but works for multiple files
      low high
[1,] 0.35 0.35

and, to confirm that you have a recent version:

> sessionInfo()
R Under development (unstable) (2018-04-02 r74505)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] msdata_0.21.0       MSnbase_2.7.1       ProtGenerics_1.13.0
[4] BiocParallel_1.15.5 mzR_2.15.1          Rcpp_0.12.17       
[7] Biobase_2.41.0      BiocGenerics_0.27.0

loaded via a namespace (and not attached):
 [1] IRanges_2.15.13       zlibbioc_1.27.0       MASS_7.3-50          
 [4] doParallel_1.0.11     munsell_0.4.3         colorspace_1.3-2     
 [7] impute_1.55.0         lattice_0.20-35       rlang_0.2.0          
[10] foreach_1.4.4         plyr_1.8.4            mzID_1.19.0          
[13] grid_3.6.0            gtable_0.2.0          affy_1.59.0          
[16] iterators_1.0.9       digest_0.6.15         lazyeval_0.2.1       
[19] tibble_1.4.2          preprocessCore_1.43.0 affyio_1.51.0        
[22] ggplot2_2.2.1         S4Vectors_0.19.5      codetools_0.2-15     
[25] MALDIquant_1.17       limma_3.37.1          BiocInstaller_1.31.1 
[28] compiler_3.6.0        pillar_1.2.2          pcaMethods_1.73.0    
[31] scales_0.5.0          stats4_3.6.0          XML_3.98-1.11        
[34] vsn_3.49.0           

blosloos commented 6 years ago

Thank you both for your super-fast replies. Yes, I can extract the isolationWindow(x) after converting the files concerned from mzXML to mzML. In mzXML, the precursor mass selection window is stored in the windowWideness attribute of the precursorMz tag; in mzML, it is stored as cvParam values.

As it seems, mzR reads the isolationWindow(x) for mzML but not for mzXML files?

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252    LC_MONETARY=German_Switzerland.1252
[4] LC_NUMERIC=C                        LC_TIME=German_Switzerland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mzR_2.14.0           Rcpp_0.12.17         BiocInstaller_1.30.0

loaded via a namespace (and not attached):
[1] compiler_3.5.0      ProtGenerics_1.12.0 parallel_3.5.0      tools_3.5.0         Biobase_2.40.0     
[6] codetools_0.2-15    BiocGenerics_0.26.0 XML_3.98-1.11   

lgatto commented 6 years ago

"As it seems, mzR reads the isolationWindow(x) for mzML but not for mzXML files?"

Indeed.

It should be easy to update the code to work with mzXML files too, but I won't have time before July. I'm going to close this issue. If you would like to see this feature added, feel free to re-open the issue and tag it as a feature request.
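
In the meantime, a possible workaround is to read the windowWideness attribute straight from the mzXML text. The following is a minimal sketch in base R, not how mzR does or would implement it: mzxml_isolation_window() is a hypothetical helper, and it assumes that windowWideness is the total width of the isolation window and that the window is centred on the precursor m/z, so each offset is half the width. That assumption should be checked against the instrument or converter settings.

```r
# Hypothetical helper (NOT part of mzR): extract the precursor m/z and
# the windowWideness attribute from raw mzXML text.
mzxml_isolation_window <- function(lines) {
  txt <- paste(lines, collapse = "\n")
  # All <precursorMz ...>value</precursorMz> elements in the text.
  tags <- regmatches(
    txt, gregexpr("<precursorMz[^>]*>[^<]*</precursorMz>", txt)
  )[[1]]
  # Keep only elements that actually carry the attribute.
  tags <- tags[grepl('windowWideness="', tags, fixed = TRUE)]
  width  <- as.numeric(sub('.*windowWideness="([^"]*)".*', "\\1", tags))
  target <- as.numeric(sub(".*>([^<]*)</precursorMz>$", "\\1", tags))
  # ASSUMPTION: symmetric window, so low and high offsets are width / 2.
  cbind(target = target, low = width / 2, high = width / 2)
}

# The mzXML excerpt quoted earlier in this thread.
mzxml_line <- '<precursorMz precursorScanNum="2" precursorIntensity="6.425451875e05" activationMethod="HCD" windowWideness="80.0">385.0</precursorMz>'

w <- mzxml_isolation_window(mzxml_line)
w  # one row: target 385, low 40, high 40
```

For a real file, mzxml_isolation_window(readLines(file_path)) returns one row per precursorMz element that has the attribute, in file order. Matching the rows back to scan numbers would need a proper XML parser rather than regular expressions.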

blosloos commented 6 years ago

Thanks again - I will do so :-) https://github.com/holman/ama/issues/47 Just cannot re-open this issue