sgibb / MALDIquant

Quantitative Analysis of Mass Spectrometry Data
https://strimmerlab.github.io/software/maldiquant/
60 stars 25 forks source link

Error and question regarding alignSpectra function #57

Closed foellmelanie closed 6 years ago

foellmelanie commented 6 years ago

Hey, I am trying to use the “alignSpectra” function for m/z calibration of mass spectrometry imaging data. The data are either read in as imzML files or directly constructed from the mass spectra by using “createMassSpectrum”. The reference m/z is created with “createMassPeaks” and contains only our 5 calibrant m/z. Despite the warning that the reference MassPeaks object contains very few peaks we are satisfied with the alignment of the spectra. Unfortunately for some datasets I obtain the following error: "Error in approxfun(x = lo$x, y = lo$y, rule = 2L) : need at least two non-NA values to interpolate" What does this mean, is this an m/z or intensity problem?

For other datasets I get a different error: "Error in FUN(X[[i]], ...) : Could not match any peak in spectrum 289 to a reference peak." This is probably due to bad quality spectra that do not have the calibrant peaks. Is there a trick to make the “alignSpectra” function skip those bad quality spectra instead of throwing an error and not getting any aligned spectrum back?

Looking forward to get some hints :) Best, Melanie

sgibb commented 6 years ago

In fact both errors are nearly the same: too few peaks in some samples to align the spectra to the reference

Currently there is no possibility to run alignSpectra or the underlying functions determineWarpingFunctions/warpMassSpectra while ignoring poor spectra/peak lists. I could introduce an argument ignoreNoMatch (if you find a better name you are welcome) that turns the error into a warning and would align just the spectra that have at least 2 matches against the reference (the bad quality spectra won't change, or could be replaced by empty ones).

Do you thing that would be helpful?

foellmelanie commented 6 years ago

Thank you very much for your answer. The information helps for now as I can filter out bad quality spectra without reference peaks beforehand. An ignoreNoMatch argument would be very useful, ideally giving the user the option to decide if bad quality spectra should be kept as they are or if they should be replaced by empty ones.

Thanks again! Melanie

sgibb commented 6 years ago

Sorry for the delay. Now MALDIquant::alignSpectra supports an allowNoMatches and emptyNoMatches argument. I will upload it to CRAN shortly (if you can't wait you can install it with devtools::install_github("sgibb/MALDIquant") ).

As example for the new functionallity:

Preparation:

library("MALDIquant")

packageVersion("MALDIquant")
# [1] ‘1.18’

## load example data
data("fiedler2009subset", package="MALDIquant")

## running typical workflow on a small subset

## transform intensities
spectra <- transformIntensity(fiedler2009subset, method="sqrt")

## smooth spectra
spectra <- smoothIntensity(spectra, method="MovingAverage")

## baseline correction
spectra <- removeBaseline(spectra)

## modify spectrum as toy example
spectra[[2]]@mass <- seq(1, 1000, length.out=42388)

New alignSpectra functionallity:

## default behaviour
alignSpectra(spectra)
# Error in FUN(X[[i]], ...) :
#  Could not match any peak in spectrum 2 to a reference peak.

alignedSpectra <- alignSpectra(spectra, allowNoMatches=TRUE)
# Warning message:
# In FUN(X[[i]], ...) :
#  Could not match any peak in spectrum 2 to a reference peak.

## second spectrum not modified (not warped)
alignedSpectra[1:2]
# [[1]]
# S4 class type            : MassSpectrum
# Number of m/z values     : 42388
# Range of m/z values      : 999.968 - 9999.938
# Range of intensity values: 0e+00 - 2.463e+02
# Memory usage             : 671.938 KiB
# Name                     : Pankreas_HB_L_061019_G10.M19
# File                     : /data/set A - discovery leipzig/control/Pankreas_HB_L_061019_G10/0_m19/1/1SLin/fid
#
# [[2]]
# S4 class type            : MassSpectrum
# Number of m/z values     : 42388
# Range of m/z values      : 1 - 1000                    ## not modified
# Range of intensity values: 0e+00 - 2.559e+02
# Memory usage             : 671.938 KiB
# Name                     : Pankreas_HB_L_061019_G10.M20
# File                     : /data/set A - discovery leipzig/control/Pankreas_HB_L_061019_G10/0_m20/1/1SLin/fid

alignedSpectra <- alignSpectra(spectra, allowNoMatches=TRUE, emptyNoMatches=TRUE)
# Warning message:
# In FUN(X[[i]], ...) :
#  Could not match any peak in spectrum 2 to a reference peak.

## second spectrum is empty now
alignedSpectra[1:2]
# [[1]]
# S4 class type            : MassSpectrum
# Number of m/z values     : 42388
# Range of m/z values      : 999.968 - 9999.938
# Range of intensity values: 0e+00 - 2.463e+02
# Memory usage             : 671.938 KiB
# Name                     : Pankreas_HB_L_061019_G10.M19
# File                     : /data/set A - discovery leipzig/control/Pankreas_HB_L_061019_G10/0_m19/1/1SLin/fid
#
# [[2]]
# S4 class type            : MassSpectrum
# Number of m/z values     : 0
# Range of m/z values      : NA
# Range of intensity values: NA
# Memory usage             : 671.938 KiB
# Name                     : Pankreas_HB_L_061019_G10.M20
# File                     : /data/set A - discovery leipzig/control/Pankreas_HB_L_061019_G10/0_m20/1/1SLin/fid

## remove empty spectra
removeEmptyMassObjects(alignedSpectra)
## just 15 spectra left, 2nd removed
foellmelanie commented 6 years ago

absolutely perfect! Thanks a lot!

foellmelanie commented 6 years ago

Hi again,

the allowNoMatches works nicely but for the emptyNoMatches argument, I do not get empty objects. For alignedSpectra[1:2] I have still a number of m/z values as well as intensities but the range of m/z values is NA-NA. Did I miss something?

alignedSpectra<- alignSpectra(maldi_data, halfWindowSize=20, SNR=2,tolerance=0.01, warpingMethod="linear", reference = mass_list, emptyNoMatches = TRUE, allowNoMatches =TRUE) Thanks again, Melanie

sgibb commented 6 years ago

That is expected. The emptyNoMatches = TRUE argument set the intensity values to zero. So you have a spectrum with the same length and mass values as before but with all intensity values zero (if that is wired I could change that, but it maybe will break some code, especially for plotting). You could use sapply(spectra, isEmpty), findEmptyMassObjects(spectra) or removeEmptyMassObjects to find or remove these "empty" (zeroed) spectra.

foellmelanie commented 6 years ago

Hi, all good now. This was probably some pre-holiday confusion...sorry...