shuzhao-li-lab / asari

asari, metabolomics data preprocessing

Retention Time Alignment Poorly Regularized Under Some Conditions #49

Closed jmmitc06 closed 1 year ago

jmmitc06 commented 1 year ago

Here is the retention time alignment plot from a recent experiment (RP - pellets, Morphic):

[Image: Screenshot 2023-06-08 at 4 44 37 PM]

And another with HILIC+ pellets:

[Image: Screenshot 2023-06-08 at 3 14 50 PM]

Most of the zig-zaggy, chaotic regression lines belong to blanks, but this occasionally happens with 'real' samples as well. I suspect peak density is a factor here. It is unclear whether this has a major impact on results, especially after filtering.

jmmitc06 commented 1 year ago

Some initial thoughts:

@amnahsiddiqa, if you have ideas for better options than LOWESS, I would like to discuss them with you. LOWESS would be my 'go-to' method as well, but maybe if we put our heads together we can think of something better.

shuzhao-li commented 1 year ago

Were the plots based on chromatograms.rt_lowess_calibration_debug or different code?

The RT mapping dictionaries only record values that differ between two samples. I think rt_lowess_calibration_debug produces the right plot.

jmmitc06 commented 1 year ago

These plots were generated using asari dashboard.

I can run it in debug mode and see if the same behavior occurs.

jmmitc06 commented 1 year ago

This issue, while not completely solved, has been partially mitigated by adding a threshold on the maximum RT delta between peaks used for alignment and by adding an option to run multiple LOWESS iterations. The threshold prevents severe outliers from entering the fit, while multiple iterations improve regularization.
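For illustration only, here is a minimal pure-Python sketch of those two ideas: discarding peak pairs whose RT difference exceeds a threshold, then fitting LOWESS with several robustness (reweighting) iterations. This is not asari's code. The function names (`filter_pairs_by_rt_delta`, `robust_lowess`), the parameters and their defaults are all hypothetical, and the smoother is a textbook tricube/bisquare LOWESS written out for clarity.

```python
import math
from statistics import median


def filter_pairs_by_rt_delta(pairs, max_rt_delta):
    """Drop (reference_rt, sample_rt) pairs whose RT difference exceeds the threshold."""
    return [(ref, smp) for ref, smp in pairs if abs(ref - smp) <= max_rt_delta]


def _local_line_at(xs, ys, weights, x0):
    """Weighted least-squares line through (xs, ys), evaluated at x0."""
    total = sum(weights)
    if total <= 0:
        return None
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx < 1e-12:  # all weight sits on one x value: use the weighted mean
        return mean_y
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    return mean_y + (sxy / sxx) * (x0 - mean_x)


def robust_lowess(xs, ys, frac=0.5, iterations=3):
    """LOWESS with `iterations` bisquare reweighting passes (0 = plain LOWESS)."""
    n = len(xs)
    if n == 0:
        return []
    k = min(n, max(2, math.ceil(frac * n)))  # number of neighbours per local fit
    robust = [1.0] * n                       # robustness weights, updated each pass
    fitted = list(ys)
    for step in range(iterations + 1):
        for i, x0 in enumerate(xs):
            bandwidth = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
            weights = [
                r * (1 - min(abs(x - x0) / bandwidth, 1.0) ** 3) ** 3  # tricube kernel
                for x, r in zip(xs, robust)
            ]
            value = _local_line_at(xs, ys, weights, x0)
            if value is not None:
                fitted[i] = value
        if step == iterations:
            break
        residuals = [abs(y - f) for y, f in zip(ys, fitted)]
        scale = median(residuals)
        if scale == 0:
            break
        # Bisquare weights: points with residuals beyond 6 * median get weight 0.
        robust = [max(0.0, 1 - (r / (6 * scale)) ** 2) ** 2 for r in residuals]
    return fitted
```

A call might look like `kept = filter_pairs_by_rt_delta(pairs, max_rt_delta=30.0)` followed by `robust_lowess([r for r, _ in kept], [s for _, s in kept])`, with the reference RTs sorted beforehand. The threshold removes gross mismatches before the fit, and the reweighting passes reduce the pull of the moderate outliers that remain.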

Alignment of blanks remains problematic but likely has no easy solution at this time.

When misalignments occur, the workaround so far has been to select a reference manually. Future efforts could seek to improve reference selection. A more outlier-tolerant algorithm, such as RANSAC, may also be a future option.
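To make the RANSAC idea concrete, here is a rough pure-Python sketch that fits a straight line to (reference RT, sample RT) pairs while ignoring gross outliers. It is an assumption-laden illustration, not something implemented in asari: the names `ransac_line` and `fit_line`, the linear model, the residual threshold and the trial count are all hypothetical, and real RT drift is often nonlinear, so a linear model may only be suitable as a pre-filter before a smoother such as LOWESS.

```python
import random


def fit_line(points):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:  # degenerate input: all x values identical
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, residual_threshold, n_trials=200, seed=0):
    """Return (slope, intercept, inliers) for the largest consensus set found."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(n_trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)  # minimal sample for a line
        if x1 == x2:
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [
            (x, y) for x, y in points
            if abs(y - (slope * x + intercept)) <= residual_threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found; consider a larger threshold")
    # Refit on the consensus set only, so outliers have no influence.
    slope, intercept = fit_line(best_inliers)
    return slope, intercept, best_inliers
```

The returned inliers could then be passed to whatever smoother does the final RT mapping. This may help with sparse samples such as blanks, where a few mismatched peaks can otherwise dominate the fit.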