sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Retention time alignment using a combination of LamaParama and subset-based alignment #747

Open CLUES-Emory opened 4 months ago

CLUES-Emory commented 4 months ago

Hello, we are exploring ways to speed up the retention time alignment step, which often takes 2-5x longer than peak detection. One option is LamaParama correction using our internal standards as landmarks. However, the internal standards are not added to our instrument blanks, so they are not detected in those injections and the retention times for the blanks will not be corrected.

To account for this, we would like to combine LamaParama with a subset that includes our pooled samples, but I do not see an option for subsetting among the LamaParama parameters (as is available for the other correction methods). Is there any way to implement this?
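
To make the request concrete, here is a minimal sketch of the behaviour being asked for, written in plain Python rather than as xcms calls. Every name in it (`build_models`, `adjust`, the `landmark_rts` layout) is hypothetical, and a simple linear interpolation stands in for whatever smoother the real method fits. Samples with detected landmarks get their own correction; samples without them, such as blanks, borrow the correction of the nearest corrected sample in injection order.

```python
from bisect import bisect_left


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation of y at x; clamps outside the range of xs."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def shift_model(observed, reference):
    """Return sorted observed landmark times and their shifts to the reference."""
    pairs = sorted((o, r - o) for o, r in zip(observed, reference) if o is not None)
    return [p[0] for p in pairs], [p[1] for p in pairs]


def build_models(landmark_rts, reference, min_landmarks=2):
    """Fit a shift model for every sample with enough detected landmarks."""
    models = {}
    for idx, observed in landmark_rts.items():
        xs, shifts = shift_model(observed, reference)
        if len(xs) >= min_landmarks:
            models[idx] = (xs, shifts)
    return models


def adjust(rt, sample_idx, models):
    """Adjust rt with the sample's own model, or the nearest modelled sample
    by injection index when the sample has none (assumed fallback rule)."""
    donor = min(models, key=lambda i: abs(i - sample_idx))
    xs, shifts = models[donor]
    return rt + interpolate(rt, xs, shifts)


# Hypothetical data: keys are injection indices, None marks an undetected landmark.
reference = [60.0, 120.0, 180.0]
landmark_rts = {
    0: [61.0, 122.0, 183.0],   # pooled sample, standards detected
    1: [None, None, None],     # instrument blank, no standards
    3: [62.0, 123.0, 184.0],   # pooled sample, standards detected
}
models = build_models(landmark_rts, reference)
blank_adjusted = adjust(122.0, 1, models)  # borrows the model of injection 0
```

Choosing the nearest injection is only one possible fallback; averaging the two neighbouring corrections would be an equally reasonable assumption.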

Also, any suggestions on how to improve processing time for the alignment step (for studies including 1000-5000 samples) would be greatly appreciated.

As always, thanks for your excellent work on XCMS!

jorainer commented 1 month ago

Sorry for the late reply! I'll add your suggestion to my TODO list. Regarding performance, are you using the newer MsExperiment/XcmsExperiment objects we introduced about a year ago? That should provide some general performance improvements.

CLUES-Emory commented 1 month ago

Thank you! I ended up implementing a rough version of this myself by searching for our internal standards in the grouped feature table and then supplying the retention times of the standards in each sample as the peakGroupsMatrix during retention time correction. Because we convert our retention times to alkane indices before peak picking and grouping, the resulting retention time adjustments are negligible, and we may skip this step altogether.
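
The lookup step described above can be sketched roughly as follows. This is plain Python, not xcms code. The feature-table layout, the tolerances, and the function names (`match_standard`, `standards_matrix`) are all assumptions made for illustration. The result is assumed to have one row per standard and one column per sample, with `None` where a standard was not found.

```python
def match_standard(features, target_mz, target_rt, ppm=10.0, rt_tol=10.0):
    """Return the feature within the m/z and rt tolerances that lies closest
    in rt to the target, or None when nothing matches."""
    mz_tol = target_mz * ppm / 1e6
    hits = [
        f for f in features
        if abs(f["mz"] - target_mz) <= mz_tol and abs(f["rt"] - target_rt) <= rt_tol
    ]
    if not hits:
        return None
    return min(hits, key=lambda f: abs(f["rt"] - target_rt))


def standards_matrix(features, standards, n_samples):
    """Build rows = standards, columns = samples from per-sample feature rts."""
    matrix = []
    for std in standards:
        hit = match_standard(features, std["mz"], std["rt"])
        if hit is None:
            matrix.append([None] * n_samples)
        else:
            matrix.append(list(hit["sample_rts"]))
    return matrix


# Hypothetical grouped feature table: median m/z and rt plus per-sample rts.
features = [
    {"mz": 180.0634, "rt": 95.0, "sample_rts": [94.8, 95.1, None]},
    {"mz": 255.2330, "rt": 310.0, "sample_rts": [309.5, 310.2, 310.6]},
]
# Hypothetical internal standards with expected m/z and retention time.
standards = [
    {"name": "IS1", "mz": 180.0630, "rt": 96.0},
    {"name": "IS2", "mz": 400.0000, "rt": 200.0},
]
rt_matrix = standards_matrix(features, standards, n_samples=3)
```

Rows that are entirely `None` would need to be dropped before the matrix is used for any correction, since they carry no retention time information.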

Thanks for the suggestion to use the MsExperiment objects. We have updated our code to work with these, and things are running quicker than before.