ElucidataInc / ElMaven

LC-MS data processing tool for large-scale metabolomics experiments.
https://resources.elucidata.io/elmaven/
GNU General Public License v2.0

Encode per compound rules for peak detection and integration into the compound database - concept for discussion #557

Open lparsons opened 6 years ago

lparsons commented 6 years ago

The idea is to identify particular compounds that require some additional parameter tweaking or peak group selection rules for a given method and encode that into the compound database. Currently, retention time is used as a way to disentangle peaks, but it would potentially be useful to encode some additional "rules" or "hints" to aid in peak detection. Some examples would be:

  1. For compounds that are known to be noisy for a given method, we could encode a different baseline correction (#556).
  2. While rt's may shift, we often know that the order of isomers does not change, making a combination of rt and peak order potentially useful.
  3. Adjust smoothing window for noisy compounds
  4. Adjust min quality setting for compounds with less Gaussian shaped peaks (long tails, etc.)
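
To make the idea concrete, here is a minimal sketch of how per-compound overrides could sit next to the existing columns of a compound database. This is not El-MAVEN's actual schema or API: the column names (`smoothing_window`, `min_quality`, `baseline_quantile`), the `PeakSettings` class and its default values are all hypothetical, chosen only to show that a blank cell could fall back to the global setting while a filled cell overrides it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PeakSettings:
    # Hypothetical global defaults; real values would come from the method.
    smoothing_window: int = 10
    min_quality: float = 0.5
    baseline_quantile: float = 0.8


# Hypothetical extra columns in the compound database and how to parse them.
OVERRIDE_COLUMNS = {
    "smoothing_window": int,
    "min_quality": float,
    "baseline_quantile": float,
}


def settings_for(row: dict, defaults: PeakSettings) -> PeakSettings:
    """Return the defaults with any non-empty per-compound overrides applied."""
    overrides = {}
    for name, cast in OVERRIDE_COLUMNS.items():
        raw = (row.get(name) or "").strip()
        if raw:  # an empty or missing cell means "use the global setting"
            overrides[name] = cast(raw)
    return replace(defaults, **overrides)


# Example: a noisy compound gets a wider smoothing window, nothing else changes.
defaults = PeakSettings()
noisy_row = {"compound": "citrate", "rt": "7.2", "smoothing_window": "21", "min_quality": ""}
noisy_settings = settings_for(noisy_row, defaults)
```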

chubukov commented 6 years ago

@lparsons with respect to point 2, one idea I had that could help is to allow alignment with respect to expected retention times. So if you have a set of compounds that is always reliably detected, you can pick those automatically, then align the observed RTs to the expected RTs, and hopefully the other compounds now match their expected RTs better as well. Obviously not foolproof, but I think it may work well in many cases. What do you think?
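
A rough sketch of what this could look like, assuming the RT drift is close to linear over the run (a simplifying assumption on my part; a piecewise or smoothed fit may be needed in practice). The function names and the example RT values are made up for illustration and are not part of El-MAVEN.

```python
def fit_linear(observed, expected):
    """Least-squares line mapping observed anchor RTs onto expected RTs."""
    n = len(observed)
    if n < 2 or n != len(expected):
        raise ValueError("need at least two matched anchor compounds")
    mean_obs = sum(observed) / n
    mean_exp = sum(expected) / n
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    if sxx == 0:
        raise ValueError("anchor RTs must not all be identical")
    sxy = sum((o - mean_obs) * (e - mean_exp) for o, e in zip(observed, expected))
    slope = sxy / sxx
    intercept = mean_exp - slope * mean_obs
    return slope, intercept


def align(observed_rts, slope, intercept):
    """Move observed RTs onto the expected-RT scale."""
    return [slope * rt + intercept for rt in observed_rts]


# Anchor compounds that are always picked confidently (made-up values, minutes).
anchor_observed = [1.2, 5.2, 9.2]
anchor_expected = [1.0, 5.0, 9.0]
slope, intercept = fit_linear(anchor_observed, anchor_expected)

# Peaks of the remaining compounds, corrected before matching to expected RTs.
corrected = align([3.2, 7.7], slope, intercept)
```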