sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)

peaks() get wrong intensity #274

Closed: NotEvenWron9 closed this issue 1 year ago

NotEvenWron9 commented 2 years ago

Hi, I am trying to extract MS data using peaks(), but the intensities look strange. The mzML file was converted from a wiff file using msconvertGUI, and the wiff file was acquired in Data Dependent Acquisition mode. Here is the result:

>library(mzR)
>ms=openMSfile("D:\\win c\\fa\\data\\raw data\\test.mzML")
>peaks(ms,1)
               mz intensity
    [1,]  600.2624       255
    [2,]  600.2659         0
    [3,]  600.7576         0
    [4,]  600.7610       255
    [5,]  600.7645         0
    [6,]  600.7749         0
    [7,]  600.7783       255
    [8,]  600.7818       255
    [9,]  600.7853         0
   [10,]  600.7887         0
   [11,]  600.7922       255
   [12,]  600.7957         0
   [13,]  600.7991         0
              ....
  [499,]  642.2661       255
  [500,]  642.2697         0
  [ reached getOption("max.print") -- omitted 11908 rows ]

And the binaryDataArray of intensity in mzML file is like this:

<binaryDataArray encodedLength="1700">
              <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float" value=""/>
              <cvParam cvRef="MS" accession="MS:1000574" name="zlib compression" value=""/>
              <cvParam cvRef="MS" accession="MS:1000515" name="intensity array" value="" unitCvRef="MS" unitAccession="MS:1000131" unitName="number of detector counts"/>
              <binary>eJzt3Uty2zAQhGEfzTfKEbzMNbPMMZKqVGUhmSQeA3TP4Pcm5S8SpgGCICTb1MfH369fPz4/vvt68tHn2fvX/ePVeXe3t6pf6nHM5uoc6vr4hT+sV26urh+WK9m4L/OXcaiSw26ck5xX1evhMa6uj8e6u...

sneumann commented 2 years ago

Hm, indeed, intensities of only 0 and 255 look suspicious. It is also interesting that the m/z acquisition starts at 600.

It is difficult to look into this with only the excerpt of the data above. Could you attach the mzML file here? On GitHub you might have to disguise it as data.zip.txt. Alternatively, you could look at the data using mscat (https://proteowizard.sourceforge.io/tools/tools_base.html) to see whether it gives you the same values for the first spectrum.

Yours, Steffen

NotEvenWron9 commented 2 years ago

Hi, thanks @sneumann, this is the mzML file I tested. I think the m/z acquisition starts at 600 because we set the lower limit of the scan window to 600. In addition, mscat gives the same result as peaks().

test.zip.txt

sneumann commented 1 year ago

Thanks. The good news: mscat and mzR agree on the intensities (although not exactly on the m/z values; maybe this is a different file?)

#    scanNumber       msLevel           m/z     intensity
                          ms1      600.1779        255.00
                          ms1      600.1814          0.00
                          ms1      600.6315          0.00
                          ms1      600.6349        255.00
                          ms1      600.6384          0.00
                          ms1      600.7596          0.00
                          ms1      600.7631        255.00
                          ms1      600.7665        255.00
                          ms1      600.7700        255.00

Each vendor and instrument has different noise levels and maximum intensities. Yours seem to stay below 1000:

> library(xcms)
> ms <- readMSData(files=c("YF_sample_STD_10ug_neg4.mzML"), mode = "onDisk")
> summary(unlist(intensity(ms)))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  0.000   0.000   3.000   6.277   8.000 765.000 
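
A similar summary can be sketched without xcms, working directly on the list of peak matrices. This is only a minimal sketch: summarise_intensities() is a hypothetical helper (not part of mzR), and the toy spectra are made-up values standing in for the list that peaks(ms) returns when it is called without a scan index.

```r
# Pool the intensity column of every spectrum and summarise it.
# `spectra` is assumed to be a list of two-column matrices (m/z, intensity),
# i.e. the shape returned by peaks(ms) for all scans of a file.
summarise_intensities <- function(spectra) {
  all_intensities <- unlist(lapply(spectra, function(p) p[, 2]))
  summary(all_intensities)
}

# Toy spectra standing in for peaks(ms); the values are invented.
toy_spectra <- list(
  cbind(mz = c(600.26, 600.27, 600.76), intensity = c(255, 0, 0)),
  cbind(mz = c(600.18, 600.63), intensity = c(0, 765))
)
intensity_summary <- summarise_intensities(toy_spectra)
print(intensity_summary)
```

With a real file, the call would presumably be summarise_intensities(peaks(ms)) on a handle opened with openMSfile().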

Also note that there seem to be two different scan types:

[image]

So you'd need to come up with a way to separate or combine them, depending on how they were measured and are intended to be used.
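
One possible way to separate them is sketched below, assuming the two scan types correspond to different MS levels (plausible for DDA data, but not confirmed in this thread). split_scans_by_level() is a hypothetical helper, and toy_hdr is an invented stand-in for the table returned by header(ms); only its seqNum and msLevel columns are used.

```r
# Group scan indices by MS level, given a header table such as the one
# returned by mzR::header(). Only the columns seqNum and msLevel are used.
split_scans_by_level <- function(hdr) {
  stopifnot(all(c("seqNum", "msLevel") %in% names(hdr)))
  split(hdr$seqNum, paste0("ms", hdr$msLevel))
}

# Toy header standing in for header(ms); the values are invented.
toy_hdr <- data.frame(seqNum = 1:6, msLevel = c(1, 2, 2, 1, 2, 2))
scan_groups <- split_scans_by_level(toy_hdr)
print(scan_groups)
```

With a real file, something like scan_groups <- split_scans_by_level(header(ms)) followed by peaks(ms, scan_groups$ms1) should then extract only the MS1 spectra.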

Yours, Steffen

NotEvenWron9 commented 1 year ago

Thanks. It is a different file: I took a subset of the first mzML file to make it small enough to upload. Also, when I convert the wiff file to mzML using msconvertGUI in peak picking mode, peaks() returns the right intensities.