smith-chem-wisc / mzLib

Library for mass spectrometry projects
GNU Lesser General Public License v3.0

support for <referenceableParamGroup> / SCIEX WIFF converter #466

Open ftwkoopmans opened 5 years ago

ftwkoopmans commented 5 years ago

WIFF files converted to mzML with the SCIEX data converter fail in MetaMorpheus with the error "Run failed, Exception: !msOrder.HasValue || !isCentroid.HasValue", whereas the same WIFF converted with ProteoWizard's qtofpeakpicker works fine.

A key difference is probably the extensive use of <referenceableParamGroupRef> in the mzML files generated by the SCIEX tool; perhaps the properties of the referenced groups are not taken into account when parsing a <spectrum> element. We worked through a similar problem with pymzML a while ago; please check the mzML snippet I posted at https://github.com/pymzml/pymzML/issues/92
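To illustrate what "taking the enclosed properties into account" could look like, here is a minimal Python sketch using only the standard library. It is not mzLib or pymzML code: the mzML fragment is a hypothetical, heavily trimmed example (not a complete, valid file), and `effective_cv_params` is an illustrative helper name. The idea is simply to merge the cvParams of every referenced group into the spectrum's own cvParams before reading "ms level" and "centroid spectrum".

```python
import xml.etree.ElementTree as ET

MZML_NS = "http://psi.hupo.org/ms/mzml"
NS = {"mz": MZML_NS}

# Hypothetical, trimmed mzML fragment: "ms level" and "centroid spectrum"
# live only in the referenced group, not on the <spectrum> itself.
MZML = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <referenceableParamGroupList count="1">
    <referenceableParamGroup id="CommonMS1SpectrumParams">
      <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/>
    </referenceableParamGroup>
  </referenceableParamGroupList>
  <run id="run1">
    <spectrumList count="1">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <referenceableParamGroupRef ref="CommonMS1SpectrumParams"/>
        <cvParam cvRef="MS" accession="MS:1000579" name="MS1 spectrum" value=""/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""


def effective_cv_params(spectrum, groups):
    """Return {accession: value} for a spectrum, including referenced groups."""
    params = {}
    # First collect params from every referenced group...
    for ref in spectrum.findall("mz:referenceableParamGroupRef", NS):
        for cv in groups[ref.get("ref")].findall("mz:cvParam", NS):
            params[cv.get("accession")] = cv.get("value")
    # ...then the spectrum's own params, which take precedence in this sketch.
    for cv in spectrum.findall("mz:cvParam", NS):
        params[cv.get("accession")] = cv.get("value")
    return params


root = ET.fromstring(MZML)
groups = {g.get("id"): g for g in root.iter(f"{{{MZML_NS}}}referenceableParamGroup")}
spectrum = root.find(".//mz:spectrum", NS)

params = effective_cv_params(spectrum, groups)
ms_order = int(params["MS:1000511"])   # "ms level"
is_centroid = "MS:1000127" in params   # "centroid spectrum"
```

Without the group lookup, neither accession would be found on this spectrum, which is consistent with the "!msOrder.HasValue || !isCentroid.HasValue" failure described above.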

Looking through the test mzML files in the mzLib repository, I found one that does contain a <referenceableParamGroupRef> element (https://github.com/smith-chem-wisc/mzLib/blob/84a58e9cd62feb7bf38cc928c6a7b0b132b18a9b/Test/tiny.pwiz.1.1.mzML). However, the required <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/> does not seem to be in the referenced param group, as it is in my mzML files; instead it is repeated for every <spectrum>. So perhaps a few minor adjustments to that test mzML (e.g. moving all "centroid spectrum" and "ms level" properties to the param group) could create a mock mzML analogous to my use case, so you can unit test this in the future.
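As a rough sketch of how such a mock file might be produced, the Python snippet below moves the "ms level" (MS:1000511) and "centroid spectrum" (MS:1000127) cvParams out of each <spectrum> into newly created param groups and leaves a <referenceableParamGroupRef> behind. The input fragment, the helper names and the "MovedParams…" group ids are all hypothetical, and the sketch assumes the file already has a <referenceableParamGroupList>. One group is created per distinct combination of values, since "ms level" differs between MS1 and MS2 spectra.

```python
import xml.etree.ElementTree as ET

MZML_NS = "http://psi.hupo.org/ms/mzml"
ET.register_namespace("", MZML_NS)
MOVE = {"MS:1000511", "MS:1000127"}  # "ms level", "centroid spectrum"


def q(tag):
    """Qualify a tag name with the mzML namespace."""
    return f"{{{MZML_NS}}}{tag}"


def move_params_to_groups(root):
    """Move the MOVE cvParams from each spectrum into shared param groups."""
    group_list = root.find(q("referenceableParamGroupList"))
    if group_list is None:
        raise ValueError("sketch assumes an existing referenceableParamGroupList")
    group_ids = {}  # (accession, value) combination -> group id
    for spectrum in list(root.iter(q("spectrum"))):
        moved = [cv for cv in spectrum.findall(q("cvParam"))
                 if cv.get("accession") in MOVE]
        if not moved:
            continue
        key = tuple(sorted((cv.get("accession"), cv.get("value", "")) for cv in moved))
        if key not in group_ids:
            group_id = f"MovedParams{len(group_ids) + 1}"  # hypothetical id scheme
            group = ET.SubElement(group_list, q("referenceableParamGroup"), id=group_id)
            for cv in moved:
                group.append(cv)
            group_ids[key] = group_id
        for cv in moved:
            spectrum.remove(cv)
        # Param group refs go before any remaining cvParams in the spectrum.
        spectrum.insert(0, ET.Element(q("referenceableParamGroupRef"), ref=group_ids[key]))
    group_list.set("count", str(len(group_list.findall(q("referenceableParamGroup")))))


# Hypothetical, trimmed input (not a complete, valid mzML file).
SAMPLE = f"""<mzML xmlns="{MZML_NS}">
  <referenceableParamGroupList count="1">
    <referenceableParamGroup id="ExistingGroup">
      <userParam name="placeholder" value=""/>
    </referenceableParamGroup>
  </referenceableParamGroupList>
  <run id="run1">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
        <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="2"/>
        <cvParam cvRef="MS" accession="MS:1000127" name="centroid spectrum" value=""/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""

root = ET.fromstring(SAMPLE)
move_params_to_groups(root)
mock_xml = ET.tostring(root, encoding="unicode")
```

If the real test file is wrapped in an <indexedmzML> element, its byte offsets would no longer match after such a rewrite and would need to be regenerated or the index dropped.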

For testing, I've uploaded two mzML files, one generated by qtofpeakpicker and one by the SCIEX tool, at https://surfdrive.surf.nl/files/index.php/s/pV1GXjfPHONyajx

rmillikin commented 5 years ago

Thanks for the thorough investigation and the tips. We use MsConvert all the time, but lack access to WIFF files and the SCIEX converters, so these things are a bit hard to catch for us. We'll investigate this soon.

ftwkoopmans commented 5 years ago

I've had better results with the SCIEX converter than with msconvert for a few datasets in the past (with different search engines, of course), so I'm curious to see whether this has any effect in a MetaMorpheus workflow.

For reference, I've also uploaded the WIFF file matching the mzML files at https://surfdrive.surf.nl/files/index.php/s/sJyaksHnzJBI9pl. You can download the SCIEX converter for free at https://sciex.com/software-support/software-downloads (usage: AB_SCIEX_MS_Converter.exe WIFF inputfile.WIFF -centroid MZML outputfile.mzML ... but note that it only works on ancient Windows installations, not Windows 8 or Windows 10). Anyway, you can most likely get by with the information and files from my initial comment.