pymzml / pymzML

pymzML - an interface between Python and mzML Mass spectrometry Files
https://pymzml.readthedocs.io/en/latest/
MIT License

Fake peaks with centroided data. #128

Closed Mattstorey closed 5 years ago

Mattstorey commented 5 years ago

Hi,

I have been seeing very large 'fake' peaks when using .peaks('centroided'). It only happens for some spectra. I see that similar behavior has been reported before, but I can't work out how to fix it. Also, when comparing against plots from other software, some of the intensities look not quite right.

Cheers.

MKoesters commented 5 years ago

Hi Mattstorey,

Thank you for reporting this! Can you please send me an mzML file and a spectrum ID (or only the affected spectrum as an XML tag) where the problem occurs? Also, did you check for these "fake" peaks only in the plot, or also in the data array returned by .peaks('centroided')? It would also be good to know whether the data in your file is in profile mode or whether you performed peak picking when converting from RAW etc. to mzML.
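One way to check the data array rather than the plot is to scan the (m/z, intensity) pairs for values far above the rest of the spectrum. The following is only a minimal sketch: the function name `find_intensity_outliers` and the `factor` threshold are hypothetical and not part of pymzML, and the median-based cutoff is an assumption about what counts as "fake".

```python
from statistics import median


def find_intensity_outliers(peaks, factor=1000.0):
    """Return (mz, intensity) pairs whose intensity exceeds factor * median.

    `peaks` is any sequence of (mz, intensity) pairs, for example the
    array returned by spectrum.peaks('centroided').
    `factor` is an arbitrary, hypothetical threshold; tune it to the data.
    """
    # Convert to plain floats so lists, tuples and array rows all work.
    pairs = [(float(mz), float(intensity)) for mz, intensity in peaks]
    if len(pairs) == 0:
        return []
    cutoff = factor * median(intensity for _, intensity in pairs)
    return [(mz, intensity) for mz, intensity in pairs if intensity > cutoff]


# Small made-up peak list with one implausibly large intensity.
example_peaks = [(100.0, 10.0), (101.0, 12.0), (102.0, 1e12), (103.0, 11.0)]
outliers = find_intensity_outliers(example_peaks)
print(outliers)
```

With pymzML, the function could be given the result of `spectrum.peaks('centroided')`, assuming `spectrum` was read through `pymzml.run.Reader`. If outliers show up here, the problem is in the data and not only in the plotting.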

Best, Manuel

Mattstorey commented 5 years ago

Hi,

I did have a look at the (mz, i) values produced by .peaks('centroided'), and the 'fake' values are in the output. This was from data converted without peak picking, so it would seem there is something wrong with the centroiding function in this case.
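Because the data here is in profile mode, the centroided output can be cross-checked against the raw profile points of the same spectrum. The sketch below flags centroids whose intensity is far larger than anything in the profile data. The function name `centroids_exceeding_profile` and the `factor` parameter are hypothetical, and the check assumes that a centroid intensity should stay within roughly an order of magnitude of the profile maximum, which may not hold for every centroiding method.

```python
def centroids_exceeding_profile(raw_peaks, centroided_peaks, factor=10.0):
    """Return centroided (mz, intensity) pairs above factor * max raw intensity.

    `raw_peaks` and `centroided_peaks` are sequences of (mz, intensity)
    pairs, e.g. from spectrum.peaks('raw') and spectrum.peaks('centroided').
    `factor` is an assumed tolerance, not a value taken from pymzML.
    """
    raw_intensities = [float(intensity) for _, intensity in raw_peaks]
    if len(raw_intensities) == 0:
        return []
    limit = factor * max(raw_intensities)
    return [
        (float(mz), float(intensity))
        for mz, intensity in centroided_peaks
        if float(intensity) > limit
    ]


# Made-up profile points and centroids; the second centroid is implausible.
raw_example = [(100.00, 5.0), (100.01, 50.0), (100.02, 5.0)]
centroided_example = [(100.01, 52.0), (250.0, 1e15)]
flagged = centroids_exceeding_profile(raw_example, centroided_example)
print(flagged)
```

A non-empty result for a spectrum points at the centroiding step, since the profile data itself contains no intensity of that size.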

The problem can be worked around by peak picking during the conversion (with msconvert) and plotting the .peaks('raw') data.

I'm more than happy to send you an example mzML file, but I can't upload it here because the file type isn't supported. Is there another way to get you the file?

Cheers, Matt

MKoesters commented 5 years ago

Could you upload it to Dropbox, Google Drive, or any other file host and share it with me?

Mattstorey commented 5 years ago

Here is a link for you to have a look at!

https://drive.google.com/open?id=15T1MYZuPfg5mpHU6gQz3WevqMIk7jCXX

The spectra with the IDs 12535, 13540, 11532, and 9523 are good examples of the strange centroided behavior.

Below is an example of the plot for spectrum 13540. I'm not sure if there are enough molecules in the observable universe to get a count that high!

[Image: plot of spectrum 13540]

MKoesters commented 5 years ago

Okay, that's really strange! I'm going to have a look and keep you updated. In your first post, you said that similar behavior has been seen before. Can you point me to an issue etc.? I can't remember having seen this issue before.

fu commented 5 years ago

Hi Matt,

Funny error. I found the bug and fixed it. Please check out the hotfix/centroiding branch.

Hope that helps

Cheers

.c

Mattstorey commented 5 years ago

Awesome!

MKoesters commented 5 years ago

Fixed with #129. I'll upload a new release to PyPI.