Open RogerGinBer opened 1 year ago
Thanks @RogerGinBer for the reproducible example. I'm wondering whether this is a problem with the actual data itself. As you say, for profile data this is very sparse. I don't know which instruments you're using, but Sciex profile-mode data also has no 0 intensities and is sparse, though not as sparse as what you have here.
While your workaround seems to work, I would prefer to fix the core functions used by `peaks_pick` in the `MsCoreUtils` package and avoid the additional step of adding 0-values (which would also have a negative impact on performance). Can you maybe have a look into the `noise` and `localMaxima` functions to evaluate which one needs adjustment?
The `refineCentroids` function also has an issue that needs to be fixed (related to this PR).
Maybe it is a problem with the export tool? E.g. ages ago, compassXport for Bruker MALDI data removed zeros (and very low values) to keep the size of the mz(X)ML files small. If I remember correctly, there was no command-line argument for this; you had to set a key in the Windows registry to avoid this behaviour.
Thanks to both! It was as @jorainer pointed out: the data itself (Bruker, Tims-TOF Pro 2) was actually already centroided, but I failed to notice it :man_facepalming:
Digging deeper into the issue, I found the problem that led me to believe the data was in profile mode:
When using `combineSpectra` (context: https://github.com/sneumann/xcms/pull/647) to merge LC-IM-MS scans into a condensed frame scan, the ppm value I set was too low (regular <5ppm error in a TOF), so the fluctuating mz points were retained separately. Here's what this fluctuation looks like, around ~8-10 ppm per step (reflects exactly 1 TOF detector cycle difference):
After summarizing the mz values with `combineSpectra`, the result looks a lot like a profile spectrum but with no zero values (and that's where I got confused).
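To illustrate why a ppm tolerance below the step size keeps the fluctuating mz values apart, here is a minimal, language-agnostic sketch in Python. It is not how `combineSpectra` groups peaks internally: `group_by_ppm` is a hypothetical helper, and the m/z values are made up to mimic a ~9 ppm step between scans.

```python
def group_by_ppm(mz_values, ppm):
    """Group sorted m/z values; consecutive values within `ppm` share a group.

    Hypothetical helper for illustration only, not the grouping used
    by combineSpectra.
    """
    groups = []
    for mz in sorted(mz_values):
        # Compare against the last value of the most recent group.
        if groups and (mz - groups[-1][-1]) / groups[-1][-1] * 1e6 <= ppm:
            groups[-1].append(mz)
        else:
            groups.append([mz])
    return groups


# Made-up m/z values for one ion, stepping by roughly 9 ppm between scans.
base = 500.0
mzs = [base * (1 + step * 9e-6) for step in range(4)]

strict = group_by_ppm(mzs, ppm=5)     # tolerance below the step size
generous = group_by_ppm(mzs, ppm=20)  # tolerance above the step size

print(len(strict), len(generous))
```

With the strict tolerance every value ends up in its own group, which is what produces the profile-like cluster of points; with the generous tolerance they collapse into a single group.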
So, in summary, my problem was actually a completely different thing... :sweat_smile:
Could we smooth/correct the mz values across adjacent scans while keeping them as separate? Do we have something for that in `Spectra`? Or should I just increase the `ppm` parameter to a generous multiple when using `combineSpectra` (just like with `xcms`'s `centWaveParam`)?
Re smooth/correct the m/z values across adjacent scans: in `MSnbase` we had `combineSpectraMovingWindow`, which essentially allowed smoothing spectra along the rt dimension (with a moving window approach). I think we did not (yet) implement that for `Spectra` though.
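As a rough illustration of the moving-window idea (not the `combineSpectraMovingWindow` implementation), the sketch below smooths the m/z of a single ion across adjacent scans with a plain moving average. It is written in Python for brevity; `smooth_mz_moving_window`, its `half_window` parameter, and the example trace are all assumptions made for this example.

```python
def smooth_mz_moving_window(mz_per_scan, half_window=1):
    """Replace each scan's m/z with the mean over neighbouring scans.

    Hypothetical helper: one m/z value per scan for a single ion,
    ordered by retention time. Windows are truncated at the edges.
    """
    smoothed = []
    n = len(mz_per_scan)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = mz_per_scan[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Made-up trace that jumps back and forth by about 9 ppm between scans.
trace = [500.0000, 500.0045, 500.0000, 500.0045, 500.0000]
smoothed = smooth_mz_moving_window(trace)

spread_before = max(trace) - min(trace)
spread_after = max(smoothed) - min(smoothed)
print(spread_before, spread_after)
```

The scans stay separate (one output value per input scan), while the scan-to-scan spread in m/z shrinks.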
Regarding the ppm error - yes, also our instruments should have < 5ppm error - but looking at the data that seems not to be correct. I usually see a bigger error.
Hi there!
While working with raw LC-IM-MS data (related to https://github.com/sneumann/xcms/pull/647), I've noticed that `pickPeaks` does not correctly centroid the data if no flanking zero values are present. Concretely, it can affect `refineCentroids` (which uses mz points that are far away).
Here's a reproducible example, directly using the internal `.peaks_pick`. The input data is a real, rather noisy, TOF scan which, despite being in profile mode (I've double-checked it), is very sparse (has no zero values):
To work around this problem, I'd like to contribute this solution (still a sketch, it should probably be refactored to remove the dependency on `R.utils`), which basically appends an arbitrary amount of zero values to the peak matrix when a mass "gap" larger than a tolerance is found (usually not larger than 0.01Da for HRMS instruments). Also, we keep the mz order of the data by imputing mz values in between (see last examples). This function would be called first and then the regular `pickPeaks` would go on as usual:
And now it would work:
What do you think? Would this internal `.add_profile_zeros` belong here or in `MsCoreUtils`?
I'll be happy to contribute, :+1: Roger
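The zero-padding idea described above can be sketched as follows. This is a language-agnostic illustration in Python, not the proposed `.add_profile_zeros` code: the function body, the `gap_tol` parameter name, the choice of one zero on each side of a gap, and the example peaks are all assumptions. The default of 0.01 only mirrors the 0.01Da figure mentioned above.

```python
def add_profile_zeros(mz, intensity, gap_tol=0.01):
    """Flank every m/z gap wider than `gap_tol` with zero-intensity points.

    Illustrative sketch only. `mz` must be sorted in increasing order.
    Zeros are placed gap_tol / 2 away from the neighbouring real point,
    so the m/z order is preserved (a gap wider than gap_tol always has
    room for both padding points).
    """
    if not mz:
        return [], []
    pad = gap_tol / 2.0
    # Leading zero before the first real point.
    out_mz, out_int = [mz[0] - pad], [0.0]
    for i, (m, inten) in enumerate(zip(mz, intensity)):
        out_mz.append(m)
        out_int.append(inten)
        is_last = i == len(mz) - 1
        if is_last or mz[i + 1] - m > gap_tol:
            # Zero right after the current point ...
            out_mz.append(m + pad)
            out_int.append(0.0)
            if not is_last:
                # ... and right before the next point across the gap.
                out_mz.append(mz[i + 1] - pad)
                out_int.append(0.0)
    return out_mz, out_int


# Made-up sparse "profile" data: two peaks separated by a large gap.
mz = [100.000, 100.002, 100.004, 100.500, 100.502]
intensity = [10.0, 50.0, 12.0, 8.0, 30.0]

padded_mz, padded_int = add_profile_zeros(mz, intensity)
print(len(padded_mz), padded_int.count(0.0))
```

Each cluster of real points ends up bracketed by zeros, which is the precondition the peak picking appears to rely on. The cost is a larger peak matrix, which is the performance concern raised in the replies.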