sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

findChromPeaks returns duplicate peaks #664

Open chuyaowang opened 1 year ago

chuyaowang commented 1 year ago

Hi, I am wondering whether you have found what causes the duplicate peak detection mentioned in issue #284, because I encountered the same problem.

[image] These two peaks are identical in every column, and there are more duplicates like this pair. It is puzzling how they are identified as separate peaks. These are the parameters that I used:

params <- CentWavePredIsoParam(
    ppm = 10,
    peakwidth = c(20, 70),
    snthresh = 30,
    prefilter = c(3, 7e5),
    mzCenterFun = "wMeanApex3",
    integrate = 2,
    mzdiff = -0.001,
    noise = 3e5,
    maxCharge = 1,
    maxIso = 5,
    polarity = "negative"
)

I tried to see whether refineChromPeaks with MergeNeighboringPeaksParam can merge them into one peak, but both peaks were removed after merging: [image]

Based on the function description, I expected the merging process to merge the mergeable peaks and leave the rest untouched, but it removed all the peaks with m/z between 71 and 79 and many others. Is this intended behavior?
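To make the duplicate check concrete, here is a minimal base R sketch that flags rows that are identical in the m/z, retention time and sample columns. The matrix `pks` is a made-up stand-in for the output of chromPeaks(); the values are invented for illustration and are not taken from the data in this issue.

```r
# Toy stand-in for the matrix returned by chromPeaks(); all values are made up.
pks <- cbind(
  mz     = c(72.0171, 72.0171, 73.0205, 79.0400),
  mzmin  = c(72.0169, 72.0169, 73.0203, 79.0398),
  mzmax  = c(72.0173, 72.0173, 73.0207, 79.0402),
  rt     = c(310.2, 310.2, 310.4, 455.0),
  rtmin  = c(300.1, 300.1, 300.3, 440.0),
  rtmax  = c(322.5, 322.5, 322.0, 470.0),
  sample = c(1, 1, 1, 1)
)

# Columns that define "the same peak" for this check.
key_cols <- c("mz", "mzmin", "mzmax", "rt", "rtmin", "rtmax", "sample")

# duplicated() on a matrix compares whole rows; the first occurrence is kept.
is_dup <- duplicated(pks[, key_cols, drop = FALSE])
n_dup <- sum(is_dup)
pks_unique <- pks[!is_dup, , drop = FALSE]
```

This only catches exact duplicates; peaks that differ slightly in any of the key columns would need a tolerance-based comparison instead.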

chuyaowang commented 1 year ago

My peak merging parameters: mpp <- MergeNeighboringPeaksParam(expandRt = 1, expandMz = 0, ppm = 5, minProp = 0.75)

jorainer commented 1 year ago

I did not find the reason why centWave sometimes finds duplicated peaks, and I also don't want to change or hack into the original centWave algorithm. For me refineChromPeaks works well enough and also removes some other peak detection artifacts (such as split peaks).

The MergeNeighboringPeaksParam that you used considers all peaks that are less than 2 seconds apart for merging (expandRt = 1 expands each peak's retention time range by 1 second on each side). This will merge not only duplicated or completely overlapping peaks but also neighboring peaks for which the signal between them does not drop below 75% of the smaller peak's intensity. I would suggest extracting a chromatogram for the respective m/z range to see what the data actually looks like.
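The rule described above can be written down as a simplified base R sketch. The function `is_merge_candidate` and all of its arguments are hypothetical and not part of xcms; the real implementation works on the measured signal between the peaks, so this is only an illustration of the two conditions (retention time gap and minProp).

```r
# Hypothetical helper, not an xcms function: a simplified reading of the rule.
is_merge_candidate <- function(rtmax_a, rtmin_b, apex_a, apex_b, valley,
                               expand_rt = 1, min_prop = 0.75) {
  # Expanding both peaks by expand_rt closes gaps shorter than 2 * expand_rt.
  close_enough <- (rtmin_b - rtmax_a) < 2 * expand_rt
  # The signal between the peaks must not drop below min_prop of the smaller apex.
  high_valley <- valley >= min_prop * min(apex_a, apex_b)
  close_enough && high_valley
}

# Gap of 1.5 s, valley at 80% of the smaller apex: candidate for merging.
r1 <- is_merge_candidate(100, 101.5, apex_a = 1e6, apex_b = 2e6, valley = 8e5)
# Gap of 3 s: too far apart with expandRt = 1.
r2 <- is_merge_candidate(100, 103, apex_a = 1e6, apex_b = 2e6, valley = 8e5)
# Valley at 50% of the smaller apex: signal drops too far between the peaks.
r3 <- is_merge_candidate(100, 101.5, apex_a = 1e6, apex_b = 2e6, valley = 5e5)
```

Under this reading, exact duplicates trivially satisfy both conditions, which is why the merging step is expected to collapse them into one peak rather than drop them.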

chuyaowang commented 1 year ago

Hi, thanks for your response!

Here is an XIC of a peak within 5 ppm of mz=72.01715: [image: xic]

I believe the data look normal enough, but there isn't any peak at this mz value after merging. The mz range I set is already very small, so even if there are other peaks within the retention time window, this peak should not be merged with them. I also noticed that this error does not happen if I use CentWaveParam.

Here I made two histograms of the unique mz values in the peak table before and after merging:

[image: centwaveprediso] With CentWavePredIsoParam, there are fewer unique mz values after merging.

[image: centwave] With CentWaveParam, the number of unique mz values stays the same before and after merging.

Here are my CentWaveParam parameters, essentially the same as before:

CentWaveParam(ppm = 10, peakwidth = c(20, 70), snthresh = 30, prefilter = c(3, 7e5), mzCenterFun = "wMeanApex3", integrate = 2, mzdiff = -0.001, noise = 3e5)

My data is labeled, so I chose CentWavePredIsoParam for peak picking, but if there is no way to fix the error, will CentWaveParam with more relaxed parameters be able to find the same peaks?
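For reference, the 5 ppm window around mz=72.01715 used for the XIC above works out as follows. This is a small base R sketch; `ppm_window` is a hypothetical helper name, not an xcms function. The resulting range is the kind of value one could pass as the m/z range when extracting a chromatogram, for example through the mz argument of chromatogram().

```r
# Hypothetical helper: symmetric m/z window of +/- ppm around a target m/z.
ppm_window <- function(mz, ppm) {
  delta <- mz * ppm / 1e6
  c(mzmin = mz - delta, mzmax = mz + delta)
}

# 5 ppm at m/z 72.01715 is roughly +/- 0.00036 on the m/z axis.
mzr <- ppm_window(72.01715, ppm = 5)
```

At such a low m/z the window is only a few ten-thousandths of a unit wide, so it is worth checking that the centroids of the peak actually fall inside it.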

jorainer commented 1 year ago

But was there also no peak at this m/z before merging? In that case I would not expect to find a peak after merging either. Peak merging does not remove peaks; it just combines overlapping or partially overlapping peaks.

chuyaowang commented 1 year ago

There was, and this is what's confusing. Before merging I had mz values like 71, 72, 73...79. After merging I only had 71 and 79, hence the histogram. The light blue bars are the mz values before merging. The red bars are the mz values after merging, which overlap with the light blue bars and make the overlapping region greenish. With CentWaveParam the unique mz values do not change, and the bars completely overlap. With CentWavePredIsoParam, the bars only partially overlap. For the moment I have switched to using CentWaveParam.

jorainer commented 1 year ago

This is very strange. MergeNeighboringPeaksParam does not remove peaks, it just merges them. Could it be that the peaks with m/z 71 are at the end of the chromPeaks matrix? Please check the output of chromPeaks after refineChromPeaks again carefully. Also, which versions are you using? Could you please report the output of sessionInfo()?
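One way to do that check without relying on row order is to ask, for every m/z seen before merging, whether any peak after merging lies within a small ppm tolerance, wherever it sits in the matrix. The sketch below uses base R only; the two vectors are made-up stand-ins for the "mz" column of chromPeaks() before and after refineChromPeaks, and `has_match` is a hypothetical helper.

```r
# Made-up stand-ins for chromPeaks(x)[, "mz"] before and after merging.
mz_before <- unique(c(71.0139, 72.0171, 72.0171, 73.0205, 79.0400))
mz_after  <- c(79.0400, 71.0139)  # row order may differ after merging

# Hypothetical helper: TRUE if any reference m/z lies within `ppm` of each value.
has_match <- function(mz, ref, ppm = 5) {
  vapply(mz, function(x) any(abs(ref - x) <= x * ppm / 1e6), logical(1))
}

# m/z values present before merging with no counterpart afterwards.
kept <- has_match(mz_before, mz_after)
lost <- mz_before[!kept]
```

If `lost` is empty on the real data, the peaks were only moved or merged; if it is not, the listed m/z values are the ones worth reporting together with the sessionInfo() output.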