subset retention time alignment

gmhhope commented 2 years ago

Dear XCMS community:

I got the following warnings when running peak-group-based retention time alignment with 15 pooled samples as the "subset". It also looks like very few peak groups are being used:

Processing 1837110 mz slices ... OK
Performing retention time correction using **339** peak groups.
Applying retention time adjustment to the identified chromatographic peaks ... OK
Warning messages:
1: In do_adjustRtime_peakGroups(chromPeaks(object, msLevel = msLevel),  :
  Span too small for 'loess' and the available number of peak groups, resetting to 0.5
2: In do_adjustRtime_peakGroups(chromPeaks(object, msLevel = msLevel),  :
  Fitted retention time deviation curves exceed points by more than 2x. This is dangerous and the algorithm is probably overcorrecting your data. Consider increasing the span parameter or switching to the linear smoothing method.
3: Adjusted retention times had to be re-adjusted for some files to ensure them being in the same order than the raw retention times. A call to 'dropAdjustedRtime' might thus fail to restore retention times of chromatographic peaks to their original values. Eventually consider to increase the value of the 'span' parameter. 
Processing 1837110 mz slices ... OK

Before this, I tested RT alignment on the exact same pooled samples together with blanks (using the pooled samples as the subset and aligning the blanks against them, just like #335). There I got only 46 peak groups, but no warnings.

Performing retention time correction using **46** peak groups.
Aligning sample number 1 against subset ... OK
Aligning sample number 2 against subset ... OK
Aligning sample number 15 against subset ... OK
Applying retention time adjustment to the identified chromatographic peaks ... OK
Processing 1270822 mz slices ... OK

Questions

Considering that in both cases I aligned different samples against the same 15 pooled samples, does the number of peak groups used for the retention time correction depend on which non-subset samples are aligned against the adjusted subset? If so, why does that happen?
How can I address the warnings? What solution would you suggest?
BTW, why don't I see "Aligning sample number X against subset" in the first case? Is that because there are so many samples to align against the subset that the message isn't printed, or could it point to an issue with my parameters?

Thanks always for your help!

Best regards, Minghao Gong

jorainer commented 2 years ago

If you use the same 15 pooled samples for the subset-based alignment in both the first run and the second (the one including only blank samples), you should get the same number of peak groups. The subset alignment performs the peak group alignment only on the subset samples (i.e. it identifies peak groups in them and aligns the subset samples based on those). In a second step it then aligns all non-subset samples to the aligned subset samples. So I don't understand why you see differences there. Could you check again and also paste the code you used for the two alignments here?
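
To make the first step concrete, here is a minimal conceptual sketch in Python. It is not xcms code. It assumes a simplified model in which a feature qualifies as an alignment anchor ("peak group") when a peak was detected in at least a given fraction of the subset samples. The function `select_peak_groups`, the `min_fraction` argument and the feature table are all hypothetical and only loosely mirror the idea of a minimum-fraction filter; other criteria a real implementation may apply (such as limits on extra peaks per sample) are ignored.

```python
from typing import Dict, List, Set


def select_peak_groups(
    features: Dict[str, Set[int]],
    subset: Set[int],
    min_fraction: float = 0.9,
) -> List[str]:
    """Return the ids of features usable as alignment anchors.

    `features` maps a feature id to the set of sample indices in which a
    peak was detected. Only samples listed in `subset` are counted, so
    peaks in non-subset samples cannot influence the selection.
    """
    needed = min_fraction * len(subset)
    return sorted(
        fid for fid, samples in features.items()
        if len(samples & subset) >= needed
    )


# Hypothetical data: samples 0-2 are pooled, 3 is a blank, 4-5 are study samples.
pooled = {0, 1, 2}
features = {
    "F1": {0, 1, 2, 3, 4, 5},  # detected everywhere
    "F2": {0, 1, 2},           # detected in the pooled samples only
    "F3": {0, 4, 5},           # detected in a single pooled sample
    "F4": {3, 4, 5},           # not detected in any pooled sample
}

anchors = select_peak_groups(features, pooled)

# Removing every non-subset sample from the table leaves the selection unchanged.
pooled_only = {fid: samples & pooled for fid, samples in features.items()}
anchors_pooled_only = select_peak_groups(pooled_only, pooled)

print(anchors)
print(anchors_pooled_only)
```

In this simplified model the selected anchors depend only on the subset columns of the feature table, so the count could differ between two runs only if the feature table passed in were itself different.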

gmhhope commented 2 years ago

I totally agree that the same number of peak groups should be used for the subsequent alignment, which is why I was confused. I will check whether something is off in my code and get back to you. Thanks!

sneumann / xcms

subset retention time alignment #578
