workflow4metabolomics / tools-metabolomics

Galaxy tools for metabolomics maintained by Workflow4Metabolomics
https://workflow4metabolomics.org/
GNU General Public License v3.0

XCMS-fillChromPeaks: more options for the remaining NA #141

Open melpetera opened 4 years ago

melpetera commented 4 years ago

Hi there,

Here is a suggestion concerning the XCMS step, on how to deal with NA values that remain NA even after fillChromPeaks. Currently, we have the option to leave these NAs as 'NA' or to convert them to '0'. The idea would be to provide a third choice that provides a controlled random value instead of 0.

The random value used to replace the NA could be defined as an integer randomly selected between inf.range and sup.range where:

Note: since this relies on random draws, a "seed" option should be provided so that users can obtain the same result when the step is re-run.
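As a minimal sketch of the seed idea in the note above: the helper name `impute_random` and its `seed` argument are hypothetical, and `inf.range` and `sup.range` are simply taken here as given integer bounds, since their exact definition is not spelled out in this comment.

```r
# Hypothetical helper (not part of xcms or the W4M wrapper):
# replace NA values with integers drawn uniformly from [inf.range, sup.range].
impute_random <- function(x, inf.range, sup.range, seed = NULL) {
  # Fixing the seed makes the random draws reproducible across runs
  if (!is.null(seed)) set.seed(seed)
  n_na <- sum(is.na(x))
  # sample.int() draws from 1..k; shift the draws into [inf.range, sup.range]
  draws <- sample.int(sup.range - inf.range + 1, n_na, replace = TRUE) + inf.range - 1
  x[is.na(x)] <- draws
  x
}

x <- c(10, NA, 30, NA)
run1 <- impute_random(x, inf.range = 1, sup.range = 5, seed = 42)
run2 <- impute_random(x, inf.range = 1, sup.range = 5, seed = 42)
identical(run1, run2)  # same seed, same imputed values
```

Without a seed, two runs on the same data would generally give different imputed values, which is why the option matters for reproducible workflows.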

@lecorguille do not hesitate to ask if this request is not clear!

Have a nice day, @jfrancoismartin and @melpetera

melpetera commented 4 years ago

Efficient code from @jfrancoismartin

Note:

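    # Replace the NAs of one variable (column) with uniform random values
    # drawn between inf.range * min and sup.range * min of that variable
    imputNA <- function(idm, inf.range, sup.range) {
      if (anyNA(idm)) {
        nbNA <- sum(is.na(idm))
        minVal <- min(idm, na.rm = TRUE)
        idm[is.na(idm)] <- runif(nbNA, min = inf.range * minVal, max = sup.range * minVal)
      }
      # Always return the vector, even when it contains no NA
      return(idm)
    }
    # inf.range and sup.range must be defined beforehand
    DM <- apply(X = DM, MARGIN = 2, FUN = imputNA, inf.range = inf.range, sup.range = sup.range)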

lecorguille commented 4 years ago

Hum, for me, it's typically something that should be integrated into the main xcms package: https://github.com/sneumann/xcms What do you think about that?

jfrancoismartin commented 4 years ago

Hum hum... actually, xcms fillpeaks tries to replace each NA with a value extracted from the raw MS file. It is an analytical replacement. If fillpeaks can't find a value, then it becomes a statistical issue, which is outside the scope of xcms. And we can propose this kind of NA imputation, which is more elegant than a plain 0 replacement.

lecorguille commented 4 years ago

I guess that one of the purposes of xcms is to provide input for statistical analysis. So it could be an xcms issue :)

What do you think about that @sneumann and @jorainer?

My idea is to reduce the code in the wrapper. If it's not something that is interesting to add to XCMS, should we add this code to our future utils R package?

jorainer commented 4 years ago

Note, xcms already has some imputation functionality: xcms::imputeRowMinRand and xcms::imputeRowMin. Nothing spectacular though.
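For readers who want to try these two functions, here is a rough sketch on a toy matrix with features as rows and samples as columns. The matrix `m` is made up for illustration, both functions are called with their default arguments, and the calls are guarded so the snippet still runs where xcms is not installed.

```r
# Toy intensity matrix (illustrative values only): features in rows, samples in columns
m <- matrix(c(10, NA, 30,
               4,  5, NA),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("feat1", "feat2"), c("s1", "s2", "s3")))

filled_min <- NULL
filled_rand <- NULL
if (requireNamespace("xcms", quietly = TRUE)) {
  # Row-wise imputation based on a fraction of each row's minimum
  filled_min <- xcms::imputeRowMin(m)
  # Random variant; set a seed so the result can be reproduced
  set.seed(123)
  filled_rand <- xcms::imputeRowMinRand(m)
  print(filled_min)
  print(filled_rand)
}
```

Both functions work row by row, so a samples-by-variables table like the DM matrix used earlier in this thread would presumably need to be transposed first.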