sneumann / mzR

This is the git repository matching the Bioconductor package mzR: parser for netCDF, mzXML, mzData and mzML files (mass spectrometry data)

Export of non-existing ion mobility data #266

Closed michaelwitting closed 2 years ago

michaelwitting commented 2 years ago

I'm using the MSnbase writeMSData function to export to .mzML after some processing. The exported .mzML then contains information on ion mobility, although it was only normal QToF data. The values are set to nan.

Here is an example: <cvParam cvRef="MS" accession="MS:1002476" name="ion mobility drift time" value="nan" unitCvRef="UO" unitAccession="UO:0000028" unitName="millisecond"/>

A discussion with @jorainer has led to this issue.

Any ideas?

michaelwitting commented 2 years ago

PS: ProteoWizard and MSConvert are version 3.0.21079

jorainer commented 2 years ago

What version of mzR are you using?

michaelwitting commented 2 years ago

2.28.0

sneumann commented 2 years ago

Can I have some context? Would that happen if you exported faahKO that way? Or was there ion mobility data in the original mzML that was removed/dropped during processing and is now turning up as NaN? Yours, Steffen

jorainer commented 2 years ago

Reproducible example:

library(MSnbase)
fl <- system.file("sciex", "20171016_POOL_POS_1_105-134.mzML", package = "msdata")
data <- readMSData(fl, mode = "onDisk")
data <- pickPeaks(smooth(data))
writeMSData(data, file = "test.mzML", copy = TRUE)

The mzML file now contains this funny ion mobility drift time spectra variable.

michaelwitting commented 2 years ago

No, there was no ion mobility data in the original files. They are from a Sciex X500R QToF. I perform smoothing and centroiding using MSnbase and then export to .mzML. I wanted to import a test file into mzMine for some visualization and then got a complaint from the software that it cannot deal with nan.

jorainer commented 2 years ago

Note that the ion mobility drift time is a default header column (spectra variable) that is returned by mzR::header (similar to others like "scan window lower limit", "scan window upper limit" etc). In theory, having this header info in the exported mzML file should not hurt, since it is supported by mzML.

michaelwitting commented 2 years ago

I guess it normally does no harm, but mzMine, for example, cannot deal with nan as the value there.

jorainer commented 2 years ago

@sneumann, do you know by chance what the correct encoding for a missing value would be in mzML? I assumed that ProteoWizard would handle that correctly...

jorainer commented 2 years ago

Hm - strange. If the ionMobilityDriftTime is NA it should actually not be exported: https://github.com/sneumann/mzR/blob/master/src/RcppPwiz.cpp#L768:L770 - there must be something strange going on.

michaelwitting commented 2 years ago

I checked the original .mzML file; ion mobility is not mentioned anywhere in it... strange... The column is already there after reading the data with readMSData, but all the entries are NA.

michaelwitting commented 2 years ago

After performing smoothing and centroiding, all of them are still NA.

michaelwitting commented 2 years ago

I checked it via @featureData@data

jorainer commented 2 years ago

Yes, it seems to be a bug in my Rcpp code - in fact, NA variables should not be exported, but it seems I'm not correctly testing for NA.
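
For illustration only, here is one common way an NA test can go wrong in C++. This is a minimal sketch, not the actual mzR code: kMissing, shouldExportNaive, shouldExport and driftTimeParam are hypothetical names, and a plain quiet NaN stands in for R's NA_real_ (which is itself a NaN with a special payload). Any comparison of a NaN with == is false and with != is true, so an equality-style check never detects a missing value.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical stand-in for R's NA_real_ (a NaN value).
const double kMissing = std::numeric_limits<double>::quiet_NaN();

// Naive check: `value != kMissing` is true for every input, including a
// missing one, because NaN never compares equal to anything (even itself).
bool shouldExportNaive(double value) {
    return value != kMissing;
}

// Safer check: std::isnan detects the missing value regardless of payload.
bool shouldExport(double value) {
    return !std::isnan(value);
}

// Hypothetical helper: returns the attribute text for the drift time,
// or an empty string when the value is missing and should be skipped.
std::string driftTimeParam(double value) {
    if (!shouldExport(value)) {
        return "";
    }
    return "value=\"" + std::to_string(value) + "\"";
}
```

With the naive check, a missing drift time would still be written and would show up as nan in the output. In real Rcpp code the equivalent test would typically use R's ISNAN macro or Rcpp's is_na helpers rather than std::isnan; which of these the fix uses is not stated in this thread.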

jorainer commented 2 years ago

Fixed in PR #267