sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)

mzR not able to read mzML #73

Closed xdomingoal closed 7 years ago

xdomingoal commented 7 years ago

I'm currently unable to read mzML/mzXML files (attached) with mzR. The files come from QqQ samples acquired by MRM. I converted them with ProteoWizard, trying different parameters and options. The conversion seems to work (the file loads in ProteoWizard's seeMS), but the file appears to be empty when read by mzR.

standmix-5.mzML.zip

lgatto commented 7 years ago

Looking at the content of your mzML file, I can understand why it doesn't work. It doesn't contain any MS data (of any level); it contains chromatograms: the full TIC and, as far as I can see, 137 individual SRMs (extracted chromatograms).
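For anyone who wants to check quickly whether a converted file is in the same situation, here is a minimal sketch of that check. It is a naive text scan, not a real XML parser, and the function names `countOpeningTags` and `isChromatogramOnly` are made up for illustration; they are not part of mzR or ProteoWizard. It assumes the mzML text has already been read into a string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count opening tags whose name is exactly `tag` (e.g. "spectrum").
// Naive scan: comments and CDATA sections are not treated specially.
std::size_t countOpeningTags(const std::string& xml, const std::string& tag) {
    assert(!tag.empty());
    const std::string needle = "<" + tag;
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = xml.find(needle, pos)) != std::string::npos) {
        const std::size_t next = pos + needle.size();
        if (next < xml.size()) {
            const char c = xml[next];
            // Require a delimiter after the name so that "<spectrumList"
            // is not counted as a "<spectrum" element.
            if (c == ' ' || c == '>' || c == '/' ||
                c == '\t' || c == '\n' || c == '\r') {
                ++count;
            }
        }
        pos = next;
    }
    return count;
}

// True when the document has chromatogram elements but no spectrum elements.
bool isChromatogramOnly(const std::string& xml) {
    return countOpeningTags(xml, "spectrum") == 0 &&
           countOpeningTags(xml, "chromatogram") > 0;
}
```

A file for which `isChromatogramOnly` returns true would look empty to a reader that only walks the spectrum list, which matches the behaviour reported here.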

We don't support that data, mainly because we haven't had the need so far. But I wouldn't rule out doing some work towards it. What would be your use case? What do you want to use R for?

sneumann commented 7 years ago

So the data is in <chromatogram> tags, not in <spectrum>. It should be fairly easy to add support, and if @xdomingoal wants, we could talk you through it and take pull requests. What's needed is to extend RcppPwiz.cpp: in addition to msd->run.spectrumListPtr, add some kind of msd->run.chromatogramListPtr, and of course handle chromatograms similarly to SpectrumPtr s = slp->spectrum(whichScan - 1, true); . So no rocket science, but both C++ and R experience are required.
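To make the suggested shape concrete, here is a rough sketch of a chromatogram accessor that mirrors the spectrum call quoted above. Everything in it is a stand-in: `Chromatogram`, `ChromatogramList`, `Run` and `getChromatogram` are hypothetical types and names written for this example, not the ProteoWizard classes and not the code that ended up in RcppPwiz.cpp. Only the member name `chromatogramListPtr` and the 1-based-to-0-based index shift are taken from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chromatogram: an id plus time/intensity arrays.
struct Chromatogram {
    std::string id;
    std::vector<double> time;
    std::vector<double> intensity;
};
using ChromatogramPtr = std::shared_ptr<Chromatogram>;

// Hypothetical stand-in for a chromatogram list with 0-based access.
class ChromatogramList {
public:
    explicit ChromatogramList(std::vector<Chromatogram> items)
        : items_(std::move(items)) {}

    std::size_t size() const { return items_.size(); }

    // Same call shape as slp->spectrum(index, getBinaryData).
    ChromatogramPtr chromatogram(std::size_t index, bool getBinaryData) const {
        assert(index < items_.size());
        auto c = std::make_shared<Chromatogram>(items_[index]);
        if (!getBinaryData) {
            // Metadata only: drop the data arrays.
            c->time.clear();
            c->intensity.clear();
        }
        return c;
    }

private:
    std::vector<Chromatogram> items_;
};
using ChromatogramListPtr = std::shared_ptr<ChromatogramList>;

// Hypothetical stand-in for the run object holding the list pointer.
struct Run {
    ChromatogramListPtr chromatogramListPtr;
};

// R-facing accessor: whichChrom is 1-based, as whichScan is for spectra.
// Returns nullptr when there is no list or the index is out of range.
ChromatogramPtr getChromatogram(const Run& run, std::size_t whichChrom) {
    const ChromatogramListPtr& clp = run.chromatogramListPtr;
    if (!clp || whichChrom < 1 || whichChrom > clp->size()) {
        return nullptr;
    }
    return clp->chromatogram(whichChrom - 1, true);
}
```

Returning nullptr for a missing list or a bad index is just one choice for the sketch; the real binding might prefer to raise an R error instead.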

xdomingoal commented 7 years ago

Sounds good, I can try. Let me know how to proceed.

lgatto commented 7 years ago

Actually, there is a RcppPwiz::getChromatogramsInfo() already, but it only returns the full chromatogram. The current method would need to be modified to return selected XICs. I will have a look asap.

lgatto commented 7 years ago

I got a bit of time this evening and it seems to work. I will need to do some more testing and add a couple of helper functions and hope to push working code in a couple of days or so.

lgatto commented 7 years ago

@xdomingoal - do you mind if I use the standmix-5.mzML file as an SRM test for the new functionality? That file would be added to msdata for general testing purposes. If you agree (I will acknowledge you in the data file man page), could you send a short description of the data: instrument, basic info about the sample, ...?

xdomingoal commented 7 years ago

No problem!

Sample from mouse brain acquired by HILIC ESI-QqQ/MS in dynamic multiple reaction monitoring (MRM) mode. The HPLC system was a 1290 Infinity (Agilent Technologies) coupled to an ion-Funnel Triple quadrupole 6490 mass spectrometer (Agilent Technologies).

lgatto commented 7 years ago

I have added the MRM file to msdata version 0.15.1, which should become available within 24 hours (on devel only, though).

Tomorrow (most likely), I will push a new version of mzR with support for chromatograms. I will update this issue accordingly.

xdomingoal commented 7 years ago

Thank you!

lgatto commented 7 years ago

I have pushed to GitHub now. Travis fails because mzR now depends on msdata 0.15.1, which is not available yet via biocLite (you'll need to get the source from the svn server if you want it immediately).

I am considering preparing a higher-level interface in MSnbase for more convenient manipulation and processing. I will add a note on this issue linking to a new issue whenever I get there.

lgatto commented 7 years ago

I have pushed to Bioc. The first mzR builds might fail too, as data packages are only built twice a week (Wed and Sat, I believe) and the latest msdata might not be available. Things should clear up soon, however.