Closed sgibb closed 8 years ago
I'll start with some clarifications
readMSData[2]
have changed from FALSE
to NA
, which is why some changes in these if
tests have been made and the example now fails (sorry for the oversight). This has been fixed now (see 545280fd7cb27b8702b07a24f866c6e4b0127649). Obviously, there is a need for revisiting how the mode is assessed.isCentroided
estimates whether the spectrum in centroided based on the raw data. To access the slot, there's the centroided
accessor. We should always use the latter (or @centroided
) in tests; the former is meant to be used when this information is not available.I think we probably will want to identify different cases. Either, NA
is acceptable, or one must be able to determine the mode before proceeding. In these cases, one should rely on the slot.
If the slot is NA
, we could fall back on isCentroided
, but it doesn't have 100% accuracy for individual spectra, especially if we consider different MS levels and different peak picking algorithms (see #131). In such situations, I think it is safer to stop with an error, and ask the user to update the object. isCentroided
works well for a full experiment, if one assumes that all MS levels have the same mode. In that case, the majority of spectra will give the right answer, which can be used to set the mode.
> f
[1] "/home/lg390/R/x86_64-pc-linux-gnu-library/3.3/msdata/proteomics/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzML.gz"
> f <- msdata::proteomics(full.names = TRUE)
> x <- readMSData2(f)
Reading 509 spectra from file TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzML.gz
> all(is.na(centroided(x)))
[1] TRUE
> invisible(isCentroided(x))
invisible(isCentroided(x))
ctrd 1 2
FALSE 58 0
TRUE 0 451
> centroided(x, msLevel = 1) <- FALSE
> centroided(x, msLevel = 2) <- TRUE
> table(centroided(x))
FALSE TRUE
58 451
May we could have something like this at the beginning of functions that require the mode to be defined.
modeIsDefined <- function(x) {
.def <- all(!is.na(centroided(x)))
if (!.def) stop("The mode of your data is undefined. See ?isCentroided for details.")
invisible(TRUE)
}
So for pickPeaks
, we would have
if (modeIsDefined(object) & centroided(x)) {
message("Object already centroided")
return(object)
}
For estimateNoise
, we would have
if (modeIsDefined(object) & !centroided(object)) {
warning("Noise estimation is only supported for profile spectra.")
return(matrix(NA, nrow=0L, ncol = 2L, dimnames = list(c(), c("mz", "intensity"))))
}
Any thoughts?
Looks good. What about a single function to save some keystrokes and errors (maybe the name is not the best, could lead to confusion with isCentroided)?
isCentroidedMode <- function(x) {
cd <- centroided(x)
if (anyNA(cd))
stop("The mode of your data is undefined. See ?isCentroided for details.")
cd
}
Or, we could add a strict argument to centroided
:
centroided(x) ## NA
centroided(x, strict = TRUE) ## error
This is even better than a new function. "strict" is a good name for the argument. Nevertheless I like to suggest "na.fail" (default: FALSE) to be in line with stats::na.fail
:
centroided(x) ## NA
centroided(x, na.fail=TRUE) ## error
In
pickPeaks
you find the following lines to test for centroided mode: functions-Spectrum.R:395and in
estimateNoise
: functions-Spectrum.R:376These lines are the reason why the example of
?plot.Spectrum.Spectrum
is not working anymore: man/plotSpectrumSpectrum-methods.Rd:62-73While the test in
pickPeaks
results in a non-centroided spectrum, the test inestimateNoise
assumes a centroided on and returns a matrix ofNA
. That's why SNR comparison fails and no peak is found, even worse an empty spectrum is returned.We should replace both tests (and maybe more in other functions) by a
isCentroided
method that handle these cases. The currentisCentroided,Spectrum-method
uses thequantile
function and doesn't investigate thecentroided
slot. Using the quantile method in all subroutines would unnecessarily slow down the processing. So we should rely on the slot, or?