sneumann / xcms

This is the git repository matching the Bioconductor package xcms: LC/MS and GC/MS Data Analysis

data splitting removes adjustedRtime #224

Closed michaelwitting closed 7 years ago

michaelwitting commented 7 years ago

And here comes the next problem. filterMsLevel works fine and I get the adjusted retention times. However, to go on I need to split the data by precursor to get the data from the different SWATH pockets. This splitting drops the adjustedRtime:

ms2data <- filterMsLevel(msdata_aligned, msLevel = 2)
adjustedRtime(ms2data)

ms2data_swath <- split(ms2data, f = as.integer(fData(ms2data)$precursorMZ))
adjustedRtime(ms2data_swath[[1]])

NULL
Warning message:
In .local(object, ...) : No adjusted retention times available.
jorainer commented 7 years ago

I see - I have to think how this could be best implemented. As of now, raw and adjusted retention times do nicely co-exist.

michaelwitting commented 7 years ago

My workaround for the moment would be to take the data before splitting, fit a loess for each file, and then apply this loess function to the rtime after splitting. Not elegant, but a solution. Will let you know if it works.
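
The workaround described above can be sketched in base R as follows. This is only an illustration on simulated numbers for a single file, not code taken from xcms: the vectors rt_raw and rt_adj, the drift function, the span value and the subset index idx are all assumptions made up for the example.

```r
# Simulated data for one file: raw retention times plus a smooth drift
# standing in for the alignment result (illustrative values only).
set.seed(1)
rt_raw <- seq(1, 600, by = 1.5)
rt_adj <- rt_raw + 2 * sin(rt_raw / 100) + rnorm(length(rt_raw), sd = 0.05)

# Fit adjusted ~ raw BEFORE splitting. With real data: one model per file.
fit <- loess(rt_adj ~ rt_raw, span = 0.2)

# After splitting, only raw retention times remain for a subset of spectra.
idx <- seq(5, length(rt_raw), by = 7)   # hypothetical subset of spectra

# Apply the fitted model to recover approximate adjusted retention times.
rt_adj_pred <- predict(fit, newdata = data.frame(rt_raw = rt_raw[idx]))

# Largest deviation from the "true" adjusted times of the subset.
max_err <- max(abs(rt_adj_pred - rt_adj[idx]))
```

The recovered values are only as good as the loess fit, so this approximates the adjusted retention times rather than preserving them exactly.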

jorainer commented 7 years ago

Intermediate hack: use the applyAdjustedRtime function (e.g. msdata_aligned <- applyAdjustedRtime(msdata_aligned)). This replaces the raw retention times with the adjusted retention times, so any further subsetting or filtering does not drop them. You won't be able to use dropAdjustedRtime anymore, nor can you access the raw retention times: rtime will then always return the adjusted retention times.

Note: in general you should always use rtime instead of adjustedRtime. If the object has adjusted retention times (i.e. hasAdjustedRtime returns TRUE), rtime always returns the adjusted retention times; if you need the raw retention times, use rtime(x, adjusted = FALSE).

jorainer commented 7 years ago

With commit https://github.com/sneumann/xcms/commit/40e16c2909293b5dbab8bd4b6999a7df3af6a2b7 you can also use split and provide keepAdjustedRtime = TRUE.
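
As a rough picture of what keeping adjusted retention times through a split involves, the toy below uses plain vectors instead of an XCMSnExp object. The vectors from_file, rt_adj and prec_mz are invented stand-ins; the only part modelled on xcms is the regrouping step, which mirrors the split(new_adj, f = fromFile(x)[i]) call in the XCMSnExp subsetting method quoted further down in this thread.

```r
# Toy stand-ins: 6 spectra from 2 files (all values are made up).
from_file <- c(1, 1, 1, 2, 2, 2)
rt_adj    <- c(10.1, 20.2, 30.3, 10.4, 20.5, 30.6)
prec_mz   <- c(400, 425, 400, 400, 425, 425)

# Split spectrum indices by precursor window, analogous to
# split(ms2data, f = as.integer(fData(ms2data)$precursorMZ)).
idx_by_window <- split(seq_along(prec_mz), f = as.integer(prec_mz))

# For each window, subset the adjusted times with the same index and
# regroup them per file, so they stay aligned with the kept spectra.
adj_by_window <- lapply(idx_by_window, function(i) {
    unname(split(rt_adj[i], f = from_file[i]))
})
```

Each element of adj_by_window is a list with one numeric vector per file, which is the per-file layout the quoted method assigns back with adjustedRtime(newFd) <- ...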

michaelwitting commented 7 years ago

Thanks. I will try that soon and let you know if that works!

michaelwitting commented 7 years ago

I'm getting the following error.

> ms2data_swath <- xcms::split(ms2data, f = as.integer(fData(ms2data)$precursorMZ), keepAdjustedRtime = TRUE)
Note: method with signature ‘OnDiskMSnExp#logicalOrNumeric#missing#missing’ chosen for function ‘[’,
 target signature ‘XCMSnExp#logical#missing#missing’.
 "XCMSnExp#ANY#ANY#ANY" would also be valid
Error in .local(x, i, j, ..., drop) : 
  unused argument (keepAdjustedRtime = TRUE)
jorainer commented 7 years ago

OK, can you please provide what you get for:

selectMethod("[", "OnDiskMSnExp")

and for

selectMethod("[", "XCMSnExp")
michaelwitting commented 7 years ago
> selectMethod("[", "OnDiskMSnExp")
Method Definition:

function (x, i, j = "missing", ..., drop = "missing") 
{
    .local <- function (x, i, j = "missing", drop = "missing") 
    {
        if (!(is.logical(i) | is.numeric(i))) 
            stop("subsetting works only with numeric or logical")
        if (is.numeric(i)) {
            if (max(i) > length(x) | min(i) < 1) 
                stop("subscript out of bounds")
            i <- base::sort(i)
        }
        whichElements <- ls(assayData(x))[i]
        sel <- featureNames(x) %in% whichElements
        file <- base::sort(unique(fromFile(x)[sel]))
        pd <- phenoData(x)[file, , drop = FALSE]
        pData(pd) <- droplevels(pData(pd))
        x@phenoData <- pd
        x@processingData@files <- x@processingData@files[file]
        expD <- experimentData(x)
        expD@instrumentManufacturer <- expD@instrumentManufacturer[file]
        expD@instrumentModel <- expD@instrumentModel[file]
        expD@ionSource <- expD@ionSource[file]
        expD@analyser <- expD@analyser[file]
        expD@detectorType <- expD@detectorType[file]
        x@experimentData <- expD
        newFromFile <- base::match(fromFile(x), file)
        names(newFromFile) <- names(fromFile(x))
        orghd <- header(x)
        olde <- assayData(x)
        newe <- new.env(parent = emptyenv())
        if (length(whichElements) > 0) {
            for (el in whichElements) {
                sp <- olde[[el]]
                sp@fromFile <- unname(newFromFile[el])
                newe[[el]] <- sp
            }
            if (environmentIsLocked(olde)) 
                lockEnvironment(newe, bindings = bindingIsLocked(el, 
                  olde))
        }
        else {
            lockEnvironment(newe, bindings = TRUE)
        }
        x@assayData <- newe
        x@featureData <- featureData(x)[i, ]
        if (is.logical(i)) {
            x@processingData@processing <- c(processingData(x)@processing, 
                paste("Data [logically] subsetted ", sum(i), 
                  " spectra: ", date(), sep = ""))
        }
        else if (is.numeric(i)) {
            x@processingData@processing <- c(processingData(x)@processing, 
                paste("Data [numerically] subsetted ", length(i), 
                  " spectra: ", date(), sep = ""))
        }
        else {
            x@processingData@processing <- c(processingData(x)@processing, 
                paste("Data subsetted ", i, ": ", date(), sep = ""))
        }
        if (x@.cache$level > 0) {
            .cache <- ifelse(length(x) > 1, x@.cache$level, 0)
            x@.cache <- setCacheEnv(list(assaydata = assayData(x), 
                hd = orghd[sel, ]), .cache)
        }
        if (validObject(x)) 
            return(x)
    }
    .local(x, i, j, ..., drop)
}
<environment: namespace:MSnbase>

Signatures:
        x              i     j     drop 
target  "OnDiskMSnExp" "ANY" "ANY" "ANY"
defined "pSet"         "ANY" "ANY" "ANY"

and

> selectMethod("[", "XCMSnExp")
Method Definition:

function (x, i, j, ..., drop = TRUE) 
{
    if (!missing(j)) 
        stop("subsetting by columns ('j') not supported")
    if (missing(i)) 
        return(x)
    else if (!(is.numeric(i) | is.logical(i))) 
        stop("'i' has to be either numeric or logical")
    keepAdjustedRtime <- list(...)$ke
    if (is.null(keepAdjustedRtime)) 
        keepAdjustedRtime <- FALSE
    if (hasFeatures(x) | hasChromPeaks(x)) {
        suppressMessages(x <- dropFeatureDefinitions(x, keepAdjustedRtime = keepAdjustedRtime))
        suppressMessages(x <- dropChromPeaks(x, keepAdjustedRtime = keepAdjustedRtime))
        warning("Removed preprocessing results")
    }
    if (hasAdjustedRtime(x)) {
        if (keepAdjustedRtime) {
            new_adj <- rtime(x, adjusted = TRUE)[i]
            newFd <- new("MsFeatureData")
            newFd@.xData <- .copy_env(x@msFeatureData)
            adjustedRtime(newFd) <- unname(split(new_adj, f = fromFile(x)[i]))
            lockEnvironment(newFd, bindings = TRUE)
            x@msFeatureData <- newFd
        }
        else {
            suppressMessages(x <- dropAdjustedRtime(x))
        }
    }
    callNextMethod()
}
<environment: namespace:xcms>

Signatures:
        x          i     j     drop 
target  "XCMSnExp" "ANY" "ANY" "ANY"
defined "XCMSnExp" "ANY" "ANY" "ANY"
jorainer commented 7 years ago

OK, please update MSnbase to the latest GitHub version with devtools::install_github("lgatto/MSnbase"). I changed the signature of the [,OnDiskMSnExp method there, so it should work with the new version (it's always a surprise which method R selects for objects with multiple inheritance levels).
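
The "unused argument" error reported above can be reproduced with a small S4 toy. This is a sketch under assumptions, not the MSnbase or xcms code: the classes Base and Derived, the generic take and the keep flag are hypothetical names chosen for the example. It shows a child method that forwards an extra argument via callNextMethod() to a parent method defined without '...'.

```r
library(methods)

# Hypothetical two-level class hierarchy.
setClass("Base", representation(x = "numeric"))
setClass("Derived", contains = "Base")

setGeneric("take", function(obj, i, ...) standardGeneric("take"))

# Parent method omits '...': R wraps it in a .local() function, and
# .local() rejects any extra argument that gets forwarded to it.
setMethod("take", "Base", function(obj, i) {
    obj@x <- obj@x[i]
    obj
})

# Child method reads an optional flag from '...' and then delegates;
# callNextMethod() passes all arguments, including 'keep', downstream.
setMethod("take", "Derived", function(obj, i, ...) {
    keep <- isTRUE(list(...)$keep)
    callNextMethod()
})

d <- new("Derived", x = c(1, 2, 3))
ok  <- take(d, 1:2)   # no extra argument: works
err <- tryCatch(take(d, 1:2, keep = TRUE),
                error = function(e) conditionMessage(e))
```

In this toy, err holds an "unused argument" message in an English locale, and declaring the parent method as function(obj, i, ...) lets the extra argument pass through silently.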

michaelwitting commented 7 years ago

Working, at least no more error messages! ;-)

jorainer commented 7 years ago

Then it should be OK. Thanks for testing.

michaelwitting commented 7 years ago

Works also!