sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Obiwarp versus Peakgroups for retention alignment #570

Closed gmhhope closed 2 years ago

gmhhope commented 3 years ago

Dear XCMS team,

I am trying to find a practical guideline for choosing between Obiwarp and Peakgroups for retention time alignment, and I cannot find a matching GitHub issue here. The vignette gives me the impression that subset alignment on a subset of QC samples (e.g., pooled samples and blanks) should by default go with Peakgroups, whereas Obiwarp is preferred for alignment to a center sample.

However, after looking into the functions and some related issues, I realized that both Obiwarp and Peakgroups can do subset alignment. So I feel a bit lost and would like to know your opinion on both methods. Does either have practical merits over the other?

For example:

  1. I don't see issues with my small dataset of homogeneous sample types. But what if I have a dataset with very different sample types that I would like to align together? (I guess Obiwarp may be the better bet in this case?)
  2. What if I need to handle bigger datasets, with many QC samples interspersed across 200-300 samples? Should I use Obiwarp or Peakgroups for subset alignment?

Thanks very much for any feedback.

Best, Minghao Gong

gmhhope commented 3 years ago

Related issues: #506, #345

jorainer commented 3 years ago

Tricky question - I guess there is no golden rule for which method to use when. At the beginning I preferred obiwarp because it does the alignment without needing any correspondence analysis beforehand. But then I got some strange alignment results with obiwarp that I did not understand and that did not make sense - and I found it very hard to then tune obiwarp to produce "meaningful" results. So, my default at the moment is subset-based peakgroups - for any data set (because you'll always have QC pool samples in which you can expect the same features to be present).

Also for very heterogeneous samples I would prefer peakgroups over obiwarp, because with the hook peaks (or whatever you want to call the peak groups) you have IMHO better control over which signal is actually used for the alignment - you could even manually define the peak groups and feed them to the method. Obiwarp uses all the signal, so it might have a hard time aligning the data if large parts of the LC-MS data do not match across samples.

In the end the best method is the one you feel most comfortable with I guess...
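For readers who want to try both approaches, the two parameter objects discussed above can be set up roughly as follows. This is a minimal sketch, not a recommendation: the sample table `samples`, its `sample_type` column and all numeric settings are illustrative assumptions, and `xdata` in the commented lines stands for an xcms result object that already contains detected chromatographic peaks.

```r
library(xcms)

## Hypothetical injection order with a QC pool at every fourth position.
samples <- data.frame(
    sample_name = sprintf("S%02d", 1:12),
    sample_type = rep(c("QC", "study", "study", "study"), 3)
)
qc_idx <- which(samples$sample_type == "QC")

## Correspondence settings that define the peak groups ("hook peaks")
## needed by the peakGroups method (values are illustrative).
pdp <- PeakDensityParam(sampleGroups = samples$sample_type,
                        minFraction = 0.9, bw = 30)

## Subset-based peakGroups alignment: the model is fitted on the QC
## samples only and the remaining samples are adjusted from them.
pgp <- PeakGroupsParam(minFraction = 0.9, span = 0.4,
                       subset = qc_idx, subsetAdjust = "average")

## Obiwarp restricted to the same QC subset.
owp <- ObiwarpParam(binSize = 0.6,
                    subset = qc_idx, subsetAdjust = "average")

## With real data (not run here):
## xdata <- groupChromPeaks(xdata, param = pdp)  # only needed for peakGroups
## xdata <- adjustRtime(xdata, param = pgp)      # or: param = owp
```

For the manual control mentioned above, xcms provides `adjustRtimePeakGroups()` to inspect the peak groups that would be used, and `PeakGroupsParam()` accepts a `peakGroupsMatrix` argument for supplying them explicitly.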

xiaodfeng commented 1 year ago

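This discussion is quite helpful. I came across a similar issue of retention time shift in a large LC-MS dataset, and I found that the subset-based peakgroups algorithm outperformed the Obiwarp algorithm. The related script is:

    ## xdata: xcms result object with detected chromatographic peaks
    ## pgp, pdp: presumably a PeakGroupsParam and a PeakDensityParam object
    F_GroupingAdjustRtime <- function(xdata, pgp, pdp) {
        xdata <- dropAdjustedRtime(xdata)
        xdata <- groupChromPeaks(xdata, param = pdp)
        xdata <- adjustRtime(xdata, param = pgp)
        ## This line appears commented out in the original post; it is
        ## re-enabled here because adjustRtime drops the feature
        ## definitions that fillChromPeaks requires.
        xdata <- groupChromPeaks(xdata, param = pdp)
        xdata <- fillChromPeaks(xdata)
        return(xdata)
    }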
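One way to see how the two methods behave on the same data is to run both and look at the size of the retention time corrections. The helper below is a hypothetical sketch (`compare_alignments` is not part of xcms). It assumes that `xdata` already contains detected chromatographic peaks and that `pdp`, `pgp` and `owp` are `PeakDensityParam`, `PeakGroupsParam` and `ObiwarpParam` objects.

```r
library(xcms)

## Hypothetical helper: run both alignment methods on the same object
## and summarise the absolute retention time corrections (in seconds).
compare_alignments <- function(xdata, pdp, pgp, owp) {
    ## Start both runs from the raw retention times.
    raw <- dropAdjustedRtime(xdata)
    rt_raw <- rtime(raw, adjusted = FALSE)

    ## peakGroups needs correspondence results before alignment.
    pg <- adjustRtime(groupChromPeaks(raw, param = pdp), param = pgp)
    ## Obiwarp works directly on the signal.
    ow <- adjustRtime(raw, param = owp)

    shift_pg <- abs(rtime(pg, adjusted = TRUE) - rt_raw)
    shift_ow <- abs(rtime(ow, adjusted = TRUE) - rt_raw)

    data.frame(
        method = c("peakGroups", "obiwarp"),
        median_abs_shift = c(median(shift_pg), median(shift_ow)),
        max_abs_shift = c(max(shift_pg), max(shift_ow))
    )
}

## With real data (not run here):
## compare_alignments(xdata, pdp, pgp, owp)
```

A larger correction is not necessarily a better one, so these numbers are only a starting point. Calling `plotAdjustedRtime()` on each aligned object gives a per-sample view that is easier to judge.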