sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Wrong peak picking result using CentWave #617

Open a1p6ca opened 2 years ago

a1p6ca commented 2 years ago

Hi,

I found a very strange issue. When I widen the RT range in filterRt(), one peak is no longer picked, and I am not sure whether this is related to xcms. My script is as follows:

library(xcms)
library(MSnbase)   # readMSData(), filterRt(), filterMz(), smooth()
library(magrittr)  # %>%; file, rtr and mzr are defined elsewhere

data_prof <- readMSData(file, mode = "onDisk", centroided = FALSE)

data_prof_ft <- data_prof %>% filterRt(rtr) %>% filterMz(mzr) %>% smooth(method = "SavitzkyGolay", halfWindowSize = 4L)

pk1 <- data_prof_ft %>% findChromPeaks(param = CentWaveParam(ppm = 5, peakwidth = c(5, 30), snthresh = 10, mzdiff = 0.2, prefilter = c(3, 100), noise = 10, fitgauss = TRUE))

Note that the peak at rt = 1145 s is picked only in figure 1; the only difference between the two runs is the RT range.

[Image: eic1]

[Image: eic2]

sneumann commented 2 years ago

When widening the RT range, xcms uses different wavelets (see https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504/figures/7 for a picture). The first peak is then hidden by the second one. A maximum peak width of 30 s looks a bit too large for your data, where the peaks look more like 5 s wide. Yours, Steffen

jorainer commented 2 years ago

Also, restricting the RT range might influence the background signal estimation, which is used for the snthresh parameter.
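The effect of the RT window on a signal-to-noise estimate can be illustrated with a small base R sketch. This is not centWave's actual baseline or noise estimation, which works more locally and differs in detail; the simulated chromatogram, all numbers, and the helper `simple_snr` are made up for illustration.

```r
# Simulated chromatogram, one data point per second (all numbers are made up).
rt <- 1:200
sign_alt <- ifelse(rt %% 2 == 0, 1, -1)        # deterministic alternating "noise"
noise_amp <- ifelse(rt <= 100, 5, 60)          # quiet first half, noisy second half
peak <- 400 * exp(-(rt - 60)^2 / (2 * 2^2))    # Gaussian peak, apex at rt = 60 s
int <- 100 + noise_amp * sign_alt + peak

# Hypothetical helper (not an xcms function): S/N of the peak at `apex`, with
# baseline and noise taken from all points of the RT window outside the peak region.
simple_snr <- function(rt, int, rt_window, apex, half_width = 10) {
  in_window <- rt >= rt_window[1] & rt <= rt_window[2]
  background <- in_window & abs(rt - apex) > half_width
  (int[rt == apex] - mean(int[background])) / sd(int[background])
}

snr_narrow <- simple_snr(rt, int, c(30, 90), apex = 60)   # quiet region only
snr_wide <- simple_snr(rt, int, c(1, 200), apex = 60)     # includes the noisy half
c(narrow = snr_narrow, wide = snr_wide)
```

In this toy example the same peak clears a threshold of 10 when only the quiet region around it is considered, but falls below 10 once the noisy second half of the run enters the background estimate.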

a1p6ca commented 2 years ago

@sneumann @jorainer Thank you for your swift responses. I revised the RT range, and the peak detection issue seems to be resolved. However, when I used a very short RT range, the deviation between the extracted and the theoretical m/z increased even though ppm is low. It seems hard to balance accurate m/z assignment against detection of continuous peaks.

jorainer commented 2 years ago

Note that the ppm parameter in centWave has a slightly different meaning: it's used in the first step of the peak detection and it is the acceptable deviation of individual mass peaks in consecutive scans from the average m/z of the (currently defined) peak. On our data (Sciex TripleTOF) I use a ppm of 50 for centWave and that works pretty well: the difference of the detected peak's m/z and the theoretical m/z is then still below 5ppm for most peaks.

And yes, you're right. Deciding on the best settings is tricky - and will depend also on the instrument and data you have.
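The difference between the scan-to-scan scatter that ppm has to tolerate and the deviation of the final peak m/z from the theoretical value can be shown with a few lines of base R. The m/z values below are invented, `ppm_dev` is a hypothetical helper rather than an xcms function, and a plain mean stands in for the peak m/z as a simplification.

```r
# Hypothetical helper: signed deviation of `observed` from `reference` in ppm.
ppm_dev <- function(observed, reference) {
  (observed - reference) / reference * 1e6
}

mz_theo <- 500.0000                       # assumed theoretical m/z
# Made-up centroid m/z values of one peak in five consecutive scans.
mz_scans <- c(500.010, 499.992, 500.006, 499.990, 500.007)

mz_peak <- mean(mz_scans)                 # peak m/z (plain mean, a simplification)
scan_dev <- ppm_dev(mz_scans, mz_peak)    # scatter of single scans around the peak m/z
peak_dev <- ppm_dev(mz_peak, mz_theo)     # deviation of the peak m/z from theory

max(abs(scan_dev))   # largest per-scan deviation, in ppm
abs(peak_dev)        # deviation of the averaged m/z, in ppm
```

With these numbers the individual scans deviate by up to about 22 ppm from the peak's mean m/z, so a ppm setting of 5 would be too strict for them, while the averaged m/z is only about 2 ppm away from the theoretical value.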