Open videlc opened 3 years ago
Firstly, I would appreciate if you could use the newer functions instead of the xcmsSet
call - please have a look at the vignette. The new functions build directly on the MSnbase
package and its objects.
Now, from the chromatograms you extract it looks like you have very narrow peaks (with 3-4 data points only and the peak width being ~ 2 seconds). What I would do is to consider changing the peakwidth
parameter, to e.g. c(1.5, 5)
:
mzfile <- "LGD_DIA_peak-picked.mzML"
mzdata <- readMSData(mzfile, mode = "onDisk")
mzdata <- findChromPeaks(mzdata, CentWaveParam(peakwidth = c(1.5, 5)))
also, maybe rethink the ppm
parameter: this is not the precision of your instrument. It is the maximal acceptable difference of m/z values in consecutive scans - usually it doesn't hurt to increase that value - I use 40ppm for our data, with an instrument that's considered to have a 5ppm error (more info about that parameter provided in the vignette).
Shooting from the hip here:
Hi,
Thanks for your answer. The peak is indeed sharp (~6 s at FWHM) and the low number of points is due to the MS method. Data acquisition was performed on a QE-HF (polarity switching ~~ 1s) with :
This results in a duty cycle between two FullMS (pos) scans of ~~ 2 secs Running the newer command did not perform better :
mzfile <- "LGD_DIA_peak-picked.mzML"
mzdata <- readMSData(mzfile, mode = "onDisk")
mzdata <- findChromPeaks(mzdata, CentWaveParam(peakwidth = c(1.5, 5),ppm = 50,snthresh = 0,noise = 0))
xset_df<-data.frame(mzdata@msFeatureData$chromPeaks)
xset_df[which(xset_df$mz > 383.0835-10/1e6*383.0835 & xset_df$mz < 383.0835+10/1e6*383.0835),]
xset_df[which(xset_df$mz > 339.09266-10/1e6*339.09266 & xset_df$mz < 339.09266+10/1e6*339.09266),]
yields :
However, by taking a look at the results findChrompeaks :
> xset_df[which.max(xset_df$sn),]
mz mzmin mzmax rt rtmin rtmax into intb maxo sn sample
CP0550 338.0823 338.0814 338.085 802.4849 800.3239 806.7729 5242005 5241997 2085130 2085129 1
338.0823 is indeed detected in the pos and neg modes (fun fact, it's the 13c isotope of [LGD4033-H]- in neg mode...). Data looks like this (note that this is RAW data not msconverted/peak-picked). :
m/z values xcms is missing :
The point of centwave ROI detection is valid since i've got few point per peaks. Orbitrap is a slower scanner than a QTOF (but can switch polarities !). Is there any workaround ?
Best regards, Vivian
I would think that you'd need filterPolarity
before peak-picking to make it work.
Do you mean when converting ? Sounds not so convenient.
Isn't polarity=
an option in xcms? It'd be convenient to get an extra column in chrompeaks with polarity.
Very good point indeed @stanstrup ! xcms
considers all spectra in the peak detection - it does not discriminate between pos and neg polarity. This might indeed be the reaason why peak detection fails for your data. So, as Jan suggests, you should split your data set by polarity:
mzdata_pos <- filterPolarity(mzdata, polarity = 1)
mzdata_neg <- filterPolarity(mzdata, polarity = 0)
and perform the peak detection separately for the the two
Yay, indeed super good point !
mzfile <- "LGD_DIA_peak-picked.mzML"
mzdata <- readMSData(mzfile, mode = "onDisk")
mzdata <- filterPolarity(mzdata, polarity = 0)
mzdata <- findChromPeaks(mzdata, CentWaveParam(peakwidth = c(2, 30),ppm = 10,snthresh = 0,noise = 0))
xset_df<-data.frame(mzdata@msFeatureData$chromPeaks)
xset_df[which(xset_df$mz > 383.0835-10/1e6*383.0835 & xset_df$mz < 383.0835+10/1e6*383.0835),]
xset_df[which(xset_df$mz > 339.09266-10/1e6*339.09266 & xset_df$mz < 339.09266+10/1e6*339.09266),]
mzfile <- "LGD_DIA_peak-picked.mzML"
mzdata <- readMSData(mzfile, mode = "onDisk")
mzdata <- filterPolarity(mzdata, polarity = 1)
mzdata <- findChromPeaks(mzdata, CentWaveParam(peakwidth = c(2, 30),ppm = 50,snthresh = 0,noise = 0))
xset_df<-data.frame(mzdata@msFeatureData$chromPeaks)
xset_df[which(xset_df$mz > 339.09266-10/1e6*339.09266 & xset_df$mz < 339.09266+10/1e6*339.09266),]
xset_df[which(xset_df$mz > 383.0835-10/1e6*383.0835 & xset_df$mz < 383.0835+10/1e6*383.0835),]
produces :
While this is a clear solution. I think the issue is closable. However, shouldn't it be automatic (i.e. peak searching each polarity at a time). I don't see why one would consider any m/z that can be detected in both polarities considering the different adducts.
Hi, only slightly related to the actual issue here:
Orbitrap is a slower scanner than a QTOF (but can switch polarities !).
I would be interested whether that it is really the case without any loss
in sensitivity. To test, you'd use the same method as your pos(1)/neg(2) setup above,
but "change" to pos(1)/pos(2). Then you could compare the two pos(1) parts.
My guess would be that it's better than on a TOF, but still some difference in signal.
Curious to hear whether that's indeed the case. Yours, Steffen
I really don't know which would be better, only got a littleexperience with qtofs. However, that exact same sample was run on an impact II in a FullMS/DDA approach as well with much fewer significant hits (i.e. fewer in-vitro generated metabolites detected). However, I would not conclude too fast by saying that pos/neg switching is more sensitive.
From our experience, one clear benefit of pos/neg switching (aside from higher throughput) is that mass spec is less subject to charge effect because it is discharging every time the polarity switches. This is beneficial in terms of sensitivity.
To add to my previous comment @sneumann, I'll do some testing and give you a feedback based on your suggestion. If I undestood clearly, running the same sample with: (1) FullMS(pos)/FullMS(neg) (2) FullMS(pos)/FullMS(pos) and compare various ions with the pos(1) scan.
Thanks anyways. Courious to hear what you think on the automatic polarity switching chrompeak
definition without requiring the filterpolarity
option.
Best regards,
Vivian
Here is the follow-up @sneumann . Same sample (5 µL plasma spe extract) injected twice with two methods : (1) fullMS pos (m/z 50-500) // fullMS neg (m/z 50-500) (2) fullMS pos (m/z 50-500) // fullMS pos(m/z 50-501) (501 to filter it out afterwards)
Converted using msconvert + peakpicking
require(xcms)
testdata_posneg<-c("pMeOH_POSNEG.mzML")
testdata_posneg <- readMSData(testdata_posneg, mode = "onDisk")
testdata_posneg <- filterRt(testdata_posneg,c(400,550))
testdata_posneg <- filterPolarity(testdata_posneg, polarity = 1)
testdata_posneg <- findChromPeaks(testdata_posneg, CentWaveParam(peakwidth = c(2, 30),ppm = 10,snthresh = 5,noise = 100))
testdata_posneg<-data.frame(testdata_posneg@msFeatureData$chromPeaks)
testdata_posneg$sample<-"posneg"
testdata_pospos<-c("pMeOH_POSPOS.mzML")
testdata_pospos <- readMSData(testdata_pospos, mode = "onDisk")
testdata_pospos <- filterRt(testdata_pospos,c(400,550))
testdata_pospos <- testdata_pospos[testdata_pospos@featureData@data$scanWindowUpperLimit==500,]
testdata_pospos <- filterPolarity(testdata_pospos, polarity = 1)
testdata_pospos <- findChromPeaks(testdata_pospos, CentWaveParam(peakwidth = c(2, 30),ppm = 10,snthresh = 5,noise = 100))
testdata_pospos<-data.frame(testdata_pospos@msFeatureData$chromPeaks)
testdata_pospos$sample<-"pospos"
comp<-rbind(testdata_pospos,testdata_posneg)
require(dplyr)
comp %>% group_by(sample) %>% summarise(n=n())
Produces :
More peaks in pospos.
Common peaks area comparison neg vs pos:
comp<-data.frame(mz=testdata_posneg$mz,rt=testdata_posneg$rt,into=testdata_posneg$into,into=testdata_posneg$into)
comp$rt_pospos<-0
comp$mz_pospos<-0
comp$into_pospos<-0
comp$into_pospos<-0
for(n in 1:nrow(comp)){
comp_mz<-comp$mz[n]
comp_rt<-comp$rt[n]
temp<-testdata_pospos[which(testdata_pospos$mz > comp_mz-5*comp_mz/1e6 &
testdata_pospos$mz < comp_mz+5*comp_mz/1e6 &
testdata_pospos$rt > comp_rt-5 &
testdata_pospos$rt < comp_rt+5),]
temp$rtdiff<-abs(temp$rt-comp_rt)
temp<-temp[which.min(temp$rtdiff),]
if(nrow(temp)==1){
comp$rt_pospos[n]<-temp$rt
comp$mz_pospos[n]<-temp$mz
comp$into_pospos[n]<-temp$into
comp$into_pospos[n]<-temp$into
print("found")
}else{print('not found')}
}
require(ggplot2)
ggplot(data=comp,aes(x=into,y=into_pospos))+
geom_point(alpha=0.2)+
geom_abline(intercept = 0,slope = 1)+
scale_x_log10()+
scale_y_log10()
Produces :
Area tend to be higher in posneg and peaks not found in pospos have area in posneg within e5-e6 range (low intensity)
Reverse comparison pos vs neg :
comp<-data.frame(mz=testdata_pospos$mz,rt=testdata_pospos$rt,into=testdata_pospos$into,into=testdata_pospos$into)
comp$rt_posneg<-0
comp$mz_posneg<-0
comp$into_posneg<-0
comp$into_posneg<-0
for(n in 1:nrow(comp)){
comp_mz<-comp$mz[n]
comp_rt<-comp$rt[n]
temp<-testdata_posneg[which(testdata_posneg$mz > comp_mz-5*comp_mz/1e6 &
testdata_posneg$mz < comp_mz+5*comp_mz/1e6 &
testdata_posneg$rt > comp_rt-5 &
testdata_posneg$rt < comp_rt+5),]
temp$rtdiff<-abs(temp$rt-comp_rt)
temp<-temp[which.min(temp$rtdiff),]
if(nrow(temp)==1){
comp$rt_posneg[n]<-temp$rt
comp$mz_posneg[n]<-temp$mz
comp$into_posneg[n]<-temp$into
comp$into_posneg[n]<-temp$into
print("found")
}else{print('not found')}
}
require(ggplot2)
ggplot(data=comp,aes(x=into,y=into_posneg))+
geom_point(alpha=0.2)+
geom_abline(intercept = 0,slope = 1)+
scale_x_log10()+
scale_y_log10()
Produces:
Same area profile with posneg being higher than pospos for common peaks. Pospos only detected peaks are more broadly distributed.
looking at the lowest peak in each chrompeaks :
> testdata_posneg[c(which.min(testdata_posneg$into)) ,]
mz mzmin mzmax rt rtmin rtmax into intb maxo sn sample
CP123 280.1611 280.1608 280.1613 504.102 502.2621 506.8588 82860.25 81077.57 27060.73 10 posneg
>
> testdata_pospos[c(which.min(testdata_pospos$into)) ,]
mz mzmin mzmax rt rtmin rtmax into intb maxo sn
CP0059 192.1742 192.1741 192.1743 419.9922 419.9922 420.7957 12052.23 12051.43 18949.12 27189
sample
CP0059 pospos
Posneg lowest is not detected in pospos and pospos lowest is seen (not detected by xcms) in posneg. Don't know what to conclude about this since i think the fewer datapoints per peaks in posneg (slow polarity switch) is affecting xcms peak detection.
Best regards, Vivian
Hi,
I'm trying to enhance peak detection performances to an existing package by integrating super-fast chromatographic peak detection capabilities of xcms. During control testing performed on data published with that package, I've been surprised to notice that xcms failed at detecting some (I did not test them all...) of the highest peaks present in the polarity switching mzML file.
Example with attempts to detect [LGD4033+HCOO]- or [LGD4033+H]:
returns :
However, peaks are detectable using MSnbase/chromatogram functions :
Produces :
and
Single polarity working example with the detection of cocaine M+H:
Produces :
The major difference that comes to my mind is that the LGD dataset is acquired in fullMS-DIA polarity switching mode whereas cocaine dataset is acquired in fullMS-DIA positive mode only.
Any (other) idea ?
With kindest regards, Vivian