sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

Postprocessing after chromatographic peak detection #414

Open jorainer opened 4 years ago

jorainer commented 4 years ago

After chromatographic peak detection I find myself frequently looking through identified peaks to see whether they make sense or are just noise (I guess that sounds familiar to most people). Things I frequently encounter are:

What I would propose is to implement a cleanChromPeaks function that could be called after findChromPeaks to clean up messy signal or refine identified peaks. Its signature could be cleanChromPeaks(object = "XCMSnExp", param), with param being a parameter object defining the settings for a specific cleaning algorithm. Examples could be:

Other implementations could follow. They all should take an XCMSnExp and a parameter object as input and return an XCMSnExp (with cleaned/improved chromatographic peaks).
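The dispatch idea above (one generic, one parameter class per cleaning algorithm, same object type in and out) could be sketched roughly as follows. This is only an illustration: a plain peak matrix stands in for the XCMSnExp, and the generic cleanPeaksSketch and the classes WidthFilterParam and IntensityFilterParam are hypothetical names, not part of xcms. The column names mirror those of the xcms peak matrix.

```r
library(methods)

# Hypothetical parameter classes, one per cleaning algorithm (not xcms API).
setClass("WidthFilterParam", representation(maxWidth = "numeric"))
setClass("IntensityFilterParam", representation(minInto = "numeric"))

# Hypothetical generic: takes peaks plus a param object, returns cleaned peaks.
setGeneric("cleanPeaksSketch",
           function(peaks, param) standardGeneric("cleanPeaksSketch"))

# Remove peaks whose retention time width exceeds maxWidth.
setMethod("cleanPeaksSketch", signature("matrix", "WidthFilterParam"),
          function(peaks, param) {
              width <- peaks[, "rtmax"] - peaks[, "rtmin"]
              peaks[width <= param@maxWidth, , drop = FALSE]
          })

# Remove peaks whose integrated intensity is below minInto.
setMethod("cleanPeaksSketch", signature("matrix", "IntensityFilterParam"),
          function(peaks, param) {
              peaks[peaks[, "into"] >= param@minInto, , drop = FALSE]
          })

# Toy peak table (made-up values).
peaks <- cbind(mz    = c(100.1, 200.2, 300.3),
               rtmin = c(10, 50, 100),
               rtmax = c(20, 130, 108),
               into  = c(5000, 80000, 300))

# Input and output have the same type, so the steps can be chained.
step1 <- cleanPeaksSketch(peaks, new("WidthFilterParam", maxWidth = 30))
step2 <- cleanPeaksSketch(step1, new("IntensityFilterParam", minInto = 1000))
```

Adding another cleaning algorithm then only requires a new parameter class and a matching method, without touching the existing ones.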

I guess other people might also have similar utility functions already implemented (@stanstrup , @michaelwitting ?) that could be added too.

stanstrup commented 4 years ago

Good idea. It would be good if it were also possible to mark them, or in some way inspect that what it is doing makes sense. Wouldn't the joining now be implicitly done by group?

The only code I have is something that runs through all the picked m/z values and tries to guess whether the m/z is a contaminant: it is a possible contaminant if the intensity is higher than X for more than Y min. Then all features within some ppm of a detected contaminant are marked as possible contaminants.
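A rough sketch of that heuristic, working on a peak table instead of the raw data (a simplifying assumption, not the code described above). The function name flagContaminants and its arguments are hypothetical, the duration threshold is given in seconds because retention times in the peak matrix are in seconds, and the example values are made up.

```r
# Flag peaks as possible contaminants (hypothetical helper, not xcms API).
# A "seed" is a peak that is both intense (maxo > minIntensity) and long
# (rtmax - rtmin > minDurationSec). Every peak whose m/z lies within `ppm`
# of a seed m/z is flagged, including the seed itself.
flagContaminants <- function(peaks, minIntensity, minDurationSec, ppm) {
    duration <- peaks[, "rtmax"] - peaks[, "rtmin"]
    isSeed <- peaks[, "maxo"] > minIntensity & duration > minDurationSec
    seedMz <- peaks[isSeed, "mz"]
    vapply(peaks[, "mz"], function(mz) {
        any(abs(mz - seedMz) / seedMz * 1e6 <= ppm)
    }, logical(1), USE.NAMES = FALSE)
}

# Toy peak table: the first peak is intense and 300 s long, the second
# has nearly the same m/z but is short and weak.
peaks <- cbind(mz    = c(149.0233, 149.0235, 301.1410, 455.2900),
               rtmin = c(30, 400, 120, 200),
               rtmax = c(330, 410, 130, 212),
               maxo  = c(2e6, 4e4, 9e5, 1e5))

flagged <- flagContaminants(peaks, minIntensity = 1e6,
                            minDurationSec = 120, ppm = 5)
```

Returning a logical vector rather than dropping rows keeps the peaks in place, so the result can be stored as an annotation and inspected before anything is removed.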

jorainer commented 4 years ago

> to mark them also.

Should be doable, since we now have the chromPeakData DataFrame that allows adding arbitrary annotations to a chromatographic peak.

> joining now be implicitly done by group?

In principle yes, but that does depend on the bw parameter - I'd like to do it before the correspondence.

Contaminant detection sounds like a great addition! This is what a combination of my above proposed methods could also do (first joining neighboring peaks and then removing stuff that is too long). Alternative approaches obviously highly welcome!
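The two-step combination mentioned here (join neighboring peaks first, then drop whatever has become too long) might look roughly like this on a plain peak matrix. It is a sketch under assumptions: mergeNeighbors, maxGapSec and maxWidthSec are hypothetical names, only the retention time range is merged (intensities are not recalculated), and the values are made up.

```r
# Join peaks with similar m/z that are adjacent or overlapping in
# retention time (hypothetical helper, not xcms API).
mergeNeighbors <- function(peaks, maxGapSec, ppm) {
    peaks <- peaks[order(peaks[, "rtmin"]), , drop = FALSE]
    keep <- logical(nrow(peaks))
    for (i in seq_len(nrow(peaks))) {
        prev <- which(keep)
        sameMz <- abs(peaks[prev, "mz"] - peaks[i, "mz"]) /
            peaks[i, "mz"] * 1e6 <= ppm
        nearby <- peaks[i, "rtmin"] - peaks[prev, "rtmax"] <= maxGapSec
        hit <- prev[sameMz & nearby]
        if (length(hit) > 0) {
            # Extend the earlier peak instead of keeping a separate one.
            peaks[hit[1], "rtmax"] <- max(peaks[hit[1], "rtmax"],
                                          peaks[i, "rtmax"])
        } else {
            keep[i] <- TRUE
        }
    }
    peaks[keep, , drop = FALSE]
}

# Toy data: three consecutive fragments of one long signal near m/z 149.02
# and two fragments of a normal peak near m/z 301.14.
peaks <- cbind(mz    = c(149.0233, 149.0234, 149.0233, 301.1410, 301.1411),
               rtmin = c(100, 161, 232, 120, 131),
               rtmax = c(160, 230, 300, 130, 138))

joined <- mergeNeighbors(peaks, maxGapSec = 2, ppm = 5)

# Second step: remove everything that is now wider than maxWidthSec.
maxWidthSec <- 60
width <- joined[, "rtmax"] - joined[, "rtmin"]
cleaned <- joined[width <= maxWidthSec, , drop = FALSE]
```

In this toy example each of the three fragments near m/z 149.02 would pass or sit close to a 60 s width limit on its own; only after joining does the 200 s signal stand out and get removed.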

cbroeckl commented 4 years ago

I have long been removing exceptionally wide chromatographic peaks after feature detection but before correspondence. With the old XCMS data structure:
```r
# Keep only peaks narrower than 3 * maxpw
if (filtPeaks) {
    orig <- xset@peaks
    good <- which((orig[, "rtmax"] - orig[, "rtmin"]) < (3 * maxpw))
    filt <- orig[good, ]
    xset@peaks <- filt
}
```

It would be great to make that a base function in XCMS.

jorainer commented 4 years ago

Functionality is now in (master branch) - if someone wants to try it.