ProteoWizard / pwiz

The ProteoWizard Library is a set of software libraries and tools for rapid development of mass spectrometry and proteomic data analysis software.
http://proteowizard.sourceforge.net/
Apache License 2.0

Keeping metadata from Thermo .raw files in .mzML conversion? #371

Open photocyte opened 5 years ago

photocyte commented 5 years ago

Hi there,

I've noticed that Thermo .raw files contain (in a binary format) useful metadata, like the sequence (.sld) information associated with the sample injection and the instrument method used, as well as metadata like the instrument parameters (e.g. turbopump speed) over the course of the run. This metadata does not make it into an MSConvert-converted .mzML file. Would it be possible for MSConvert to preserve this metadata, to the best of its ability, in the conversion to .mzML format? The sample injection volume and the instrument method used would be especially useful pieces of metadata to transfer. If it helps in evaluating how difficult this would be to implement, here is a 3rd-party project which interfaces with the ProteoWizard code to pull out the instrument-related metadata from .raw files: https://bitbucket.org/proteinspector/imondb/wiki/ThermoCompiling

All the best, -Tim

chambm commented 5 years ago

The general consensus is that putting status and error log info in the output file would be too verbose. We already provide ThermoRawMetaDump, which can access those as well as the instrument methods. Storing the instrument method as one big text userParam is something that could potentially be interesting to include in the mzML, though. That wouldn't be too verbose.
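
To make the userParam idea concrete, here is a minimal sketch of what a single text userParam carrying an instrument method could look like. It relies only on the mzML convention that a `userParam` element has `name`, `value`, and `type` attributes. The parameter name `instrument method`, the helper `make_method_user_param`, and the sample method text are all made up for illustration. The mzML namespace is omitted for brevity, and where such a param would sit in the file is not settled in this thread. This is not how msconvert does it.

```python
import xml.etree.ElementTree as ET


def make_method_user_param(method_text: str, name: str = "instrument method") -> ET.Element:
    """Build an mzML-style <userParam> holding free text.

    The default name is a hypothetical label, not an agreed-upon convention.
    """
    return ET.Element(
        "userParam",
        {"name": name, "value": method_text, "type": "xsd:string"},
    )


# Made-up method text, standing in for what a tool might extract from a .raw file.
method_text = "Scan range: 200-2000 m/z\nResolution: 70000"

param = make_method_user_param(method_text)
xml_text = ET.tostring(param, encoding="unicode")
print(xml_text)

# Parse the serialized element back to check the text survives the round trip.
round_trip = ET.fromstring(xml_text)
```

Because the whole method goes into one attribute value, line breaks are escaped on serialization and restored on parsing, so a multi-line method should come back unchanged.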